Best Evidence 

Encyclopedia (BEE) 

Empowering Educators with Evidence on Proven Programs www.bestevidence.org



Effective Reading Programs for the 
Elementary Grades: 

A Best-Evidence Synthesis 

Robert E. Slavin 
Johns Hopkins University 
-and- 

University of York 

Cynthia Lake 
Johns Hopkins University 

Bette Chambers 
University of York 
-and- 

Johns Hopkins University 

Alan Cheung 
Johns Hopkins University 

Susan Davis 

Success for All Foundation 



January, 2010 



This research was funded by the Institute of Education Sciences, U.S. Department of 
Education (Grant No. R305A040082). However, any opinions expressed are those of the authors 
and do not necessarily represent IES positions or policies. 

We thank Marilyn Adams, Steven Ross, Michael McKenna, Henry Becker, and Nancy
Madden for comments on an earlier draft. 



The Best Evidence Encyclopedia is a free web site created by the Johns Hopkins University School of Education ’s Center for Data-Driven 
Reform in Education (CDDRE) under funding from the Institute of Education Sciences, U.S. Department of Education. 



Abstract 



This article systematically reviews research on the achievement outcomes of four types of 
approaches to improving the reading success of children in the elementary grades: reading 
curricula, instructional technology, instructional process programs, and combinations of curricula 
and instructional process. Study inclusion criteria included use of randomized or matched 
control groups, a study duration of at least 12 weeks, valid achievement measures independent of 
the experimental treatments, and a final assessment at the end of grade 1 or later. A total of 63 
beginning reading (starting in K or 1) and 79 upper elementary (2-5) reading studies met these 
criteria. The review concludes that instructional process programs designed to change daily 
teaching practices have substantially greater research support than programs that focus on 
curriculum or technology alone. 






From the first day of kindergarten to the last day of elementary school, children 
substantially define themselves as readers, and this has enormous influence on their development 
as learners and as members of society. Those who succeed in becoming fluent, strategic, and 
joyful readers are not guaranteed success in school or in life, but they are well on their way. 
However, those who do not succeed in reading, or who become reluctant readers, face long odds 
in achieving success in school and life. Every educator, parent, and policy maker knows the 
critical importance of reading in the elementary grades. Further, the gap in reading performance 
between different ethnic groups, and between middle class and disadvantaged children, is 
perhaps the most important policy issue in education in the U.S. Because of the obvious 
importance of success in reading, schools invest enormous sums in initial teaching of reading 
and in remedial services for struggling readers. 

Given the great importance of success in reading for millions of children and for our 
society as a whole, one would imagine that there would be a great deal of research on how 
teachers can most effectively teach children to read. There is in fact a great deal of basic research 
on reading, and we know a lot about how children learn to read and what goes wrong when they 
fail to learn (see for example National Reading Panel, 2000; Snow, Burns, & Griffin, 1998; 
National Early Literacy Panel, 2008). Yet there is much less research evaluating the practical 
programs actually available to schools and teachers to ensure reading success, and the research 
that does exist has not been comprehensively reviewed. 

It is useful, for example, to know that effective beginning reading programs emphasize 
phonemic awareness, phonics, fluency, vocabulary, and comprehension, as concluded by the 
National Reading Panel (NRP, 2000). Reviews by Adams (1990) and by Snow, Burns, & Griffin 
(1998), as well as the NRP, have supported the importance of teaching with a strong emphasis on
phonics and phonemic awareness. Yet school leaders and teachers do not choose between
“phonics” and “no phonics.” Instead, they choose among particular textbooks, software, and 
professional development approaches. Any particular program may incorporate the five NRP 
elements to a greater or lesser degree, but each also incorporates other features (such as 
classroom organization, motivation, grouping, assessment, and professional development) that 
also determine the outcomes of the program. 

The importance of focusing attention on all aspects of reading approaches, not just on 
phonics or other NRP elements, was illustrated by the experience of the federal Reading First 
program. Based in large part on the findings of the National Reading Panel (2000) and earlier 
research syntheses, the Reading First program favored phonics and phonemic awareness, and a 
national study of Reading First by Gamse et al. (2008) and Moss et al. (2008) found that teachers 
in Reading First schools were in fact doing more phonics teaching than were those in similar 
non-Reading First schools. Yet outcomes were disappointing, with small effects seen on first 
grade decoding measures and no impact on comprehension measures in grades 1-3. Similarly, a 
large study of intensive professional development focusing on phonics found no effects on the 
reading skills of second graders (Garet et al., 2008). The findings of these large-scale 
experiments imply that while the importance of phonics and phonemic awareness in reading
instruction is well established, the addition of phonics to traditional basal instruction is not
sufficient to bring about widespread improvement in children’s reading. Other factors, especially 
relating to teaching methods, are also consequential. 

The What Works Clearinghouse (WWC, 2009), in its beginning reading topic report, 
reviewed research on reading programs evaluated in grades K through 3. However, the WWC 
only reports program ratings, and does not include discussion of the findings or draw
generalizations about the effects of types of programs. Further, WWC inclusion standards
applied in its beginning reading topic report include very brief studies (as few as 5 hours of 
instruction), and studies that used measures of skills taught in experimental but not control 
groups. The WWC does not weight by sample size, and many of its conclusions are based on atypical
effect sizes from studies with sample sizes as small as 46 (see Slavin, 2008).

The present article reviews research on the achievement outcomes of practical initial 
(non-remedial) reading programs for all elementary children, grades K-5, applying consistent 
methodological standards to the research. It is intended to provide fair summaries of the 
achievement effects of the full range of reading approaches available to educators and policy 
makers, and to summarize for researchers the current state of the art in this area. The scope of the 
review includes all types of programs that teachers, principals, or superintendents might consider 
to improve the success of their children in reading: curricula, instructional technology, 
instructional process programs, and combinations of curricula and instructional process. The 
review uses a form of best evidence synthesis (Slavin, 1986), adapted for use in reviewing “what 
works” literatures in which there are generally few studies evaluating each of many programs. It 
is part of a series, all of which used the same methods with minor adaptations. Separate 
syntheses review research on remedial, preventive, and special education programs in elementary 
reading (Slavin, Lake, Davis, & Madden, 2009), middle and high school reading programs 
(Slavin, Cheung, Groff, & Lake, 2008), and reading programs for English language learners 
(Cheung & Slavin, 2005). 






Focus of the Current Review 



The present review uses procedures similar to those used in the secondary reading review 
to examine research on initial (non-remedial) programs for elementary reading. The purpose of 
the review is to place all types of initial reading programs intended to enhance reading 
achievement on a common scale, to provide educators and policy makers with meaningful, 
unbiased information that they can use to select programs most likely to make a difference with 
their students. The review emphasizes practical programs that are or could be used at scale. It 
therefore emphasizes large studies done over significant time periods that used standard 
measures, to maximize the usefulness of the review to educators. The review also seeks to 
identify common characteristics of programs likely to make a difference in reading achievement. 
This synthesis is intended to include all kinds of approaches to reading instruction, and groups
them in four categories: reading curricula, instructional technology, instructional process 
programs, and combinations of reading curricula and instructional process. Reading curricula 
primarily encompass core reading textbooks and curricula, such as Reading Street and Open 
Court Reading. Instructional technology refers to programs that use technology to enhance 
reading achievement. This includes traditional supplementary computer-assisted instruction 
(CAI) programs, in which students are sent to computer labs for additional practice. Other 
instructional technology programs include Reading Reels, which provides embedded multimedia 
in daily lessons, and Writing to Read, which combines technology and non-technology small 
group activities. Instructional process programs rely primarily on professional development to 
give teachers effective strategies for teaching reading. These include programs focusing on 
cooperative learning, such as PALS and CIRC, and programs focusing on phonics and 
phonological awareness. Curriculum and instructional process programs, specifically Success
for All and Direct Instruction, provide specific phonetic curricula as well as extensive
professional development focused on instructional strategies. Comprehensive school reform 
(CSR) programs were included only if they included specific reading programs; for a broader 
review of outcomes of elementary CSR models, see CSRQ (2006) and Borman et al. (2003). 

Methodological Issues Characteristic of Elementary Reading Research

While this review of research on reading programs shares methodological issues common 
to all systematic reviews, there are also some key issues unique to this subject and grade level. 
The thorniest of these relates to measurement. In the early stages of reading, researchers often 
use measures such as phonemic awareness that are not “reading” in any sense, though they are 
precursors. However, measures of reading comprehension and reading vocabulary tend to have 
floor effects at the kindergarten and first grade levels. The present review included measures 
such as letter-word identification and word attack, but did not accept measures such as auditory
phonemic awareness. Measures of oral vocabulary, spelling, and language arts were excluded at 
all grade levels. 

Another problem of early reading measurement is that in kindergarten, it is possible for a 
study to find positive effects of programs that introduce skills not ordinarily taught in 
kindergarten on measures of those skills. For example, until the late 1990s it was not common in
U.S. kindergartens for children to be taught phonics or phonemic awareness. Programs that 
moved these then first-grade skills into kindergarten might appear very effective in comparison 
to control classes receiving little or no instruction on those skills, but would in fact simply be 
teaching skills the control children would probably have mastered somewhat later. 






Because of the difficulty of defining and measuring early literacy skills, multi-year 
evaluations of programs that may begin in kindergarten but follow children at least through the
end of first or second grade are of particular value. By the end of second grade, it is certain that 
control students as well as experimental students have been seriously taught to read, and it 
becomes possible to use measures of reading comprehension and reading vocabulary that more 
fully represent the goals of reading instruction, not just precursors. Multi-year studies solve the 
problem of early presentation of skills ordinarily taught later. If kindergartners are taught certain 
first grade reading skills, end of first grade or second grade measures should be able to determine 
if this early teaching was truly beneficial. Due to the unique nature of research on kindergarten- 
only programs, studies whose final posttesting took place before spring of first grade are 
reviewed in a separate section of this article. 

Review Methods 

As noted earlier, the review methods used here are adaptations of a technique called best- 
evidence synthesis (Slavin, 1986, 2008). Best-evidence syntheses seek to apply consistent, well- 
justified standards to identify unbiased, meaningful information from experimental studies, 
discussing each study in some detail, and pooling effect sizes across studies in substantively 
justified categories. The method is very similar to meta-analysis (Cooper, 1998; Lipsey & 
Wilson, 2001), adding an emphasis on narrative description of each study’s contribution. It is 
similar to the methods used by the What Works Clearinghouse (2009), with a few important 
exceptions noted in the following sections. See Slavin (2008) for an extended discussion and 
rationale for the procedures used in all of these reviews. 






Literature Search Procedures 



A broad literature search was carried out in an attempt to locate every study that could 
possibly meet the inclusion requirements. Electronic searches were made of educational 
databases (JSTOR, ERIC, EBSCO, PsycINFO, Dissertation Abstracts) using various
combinations of key words (for example, “elementary students,” “reading,” “achievement”) and 
the years 1970-2009. Results were then narrowed by subject area (for example, “reading 
intervention,” “educational software,” “academic achievement,” “instructional strategies”). In 
addition to looking for studies by key terms and subject area, we conducted searches by program 
name. Web-based repositories and education publishers’ websites were also examined. We 
attempted to contact producers and developers of reading programs to check whether they knew 
of studies that we had missed. Citations were obtained from other reviews of reading programs 
including the What Works Clearinghouse (2009) beginning reading topic report, Adams (1990), 
National Reading Panel (2000), Snow, Burns & Griffin (1998), Torgerson, Brooks, & Hall 
(2006), Rose (2006), and August & Shanahan (2006), or potentially related topics such as 
instructional technology (E. Chambers, 2003; Kulik, 2003; Murphy et al., 2002). We also 
conducted searches of recent tables of contents of key journals. We searched the following 
tables of contents from 2000 to 2009: American Educational Research Journal, Reading 
Research Quarterly, Journal of Educational Research, Journal of Educational Psychology, 
Reading and Writing Quarterly, British Educational Research Journal, and Learning and 
Instruction. Citations of studies appearing in the studies found in the first wave were also 
followed up. 






In general, effect sizes were computed as the difference between experimental and 
control individual student posttests after adjustment for pretests and other covariates, divided by 
the unadjusted posttest control group standard deviation. If the control group SD was not 
available, a pooled SD was used. Procedures described by Lipsey & Wilson (2001) and 
Sedlmeier & Gigerenzer (1989) were used to estimate effect sizes when unadjusted standard
deviations were not available, as when the only standard deviation presented was already
adjusted for covariates or when only gain score SDs were available. If pretest and posttest
means and SDs were presented but adjusted means were not, effect sizes for pretests were
subtracted from effect sizes for posttests. In multiyear studies, effect sizes may be reported for 
each year but only the final year of treatment is presented in the tables. However, if there are 
multiple cohorts (e.g., K-1, K-2, K-3), each with adequate pretests, all cohorts are included in the
tables. 
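The effect size rules above can be illustrated with a minimal Python sketch. The function and parameter names are hypothetical, and the pooled SD formula shown is the conventional degrees-of-freedom-weighted one, which is an assumption: the review does not state which pooling formula it used.

```python
from math import sqrt
from typing import Optional


def pooled_sd(sd_e: float, n_e: int, sd_c: float, n_c: int) -> float:
    """Conventional pooled SD (assumed formula; not specified in the review)."""
    return sqrt(((n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2) / (n_e + n_c - 2))


def effect_size(adj_mean_e: float, adj_mean_c: float,
                control_sd: Optional[float] = None,
                fallback_pooled_sd: Optional[float] = None) -> float:
    """Adjusted posttest difference divided by the unadjusted control SD.

    Falls back to a pooled SD only when the control group SD is unavailable.
    """
    sd = control_sd if control_sd is not None else fallback_pooled_sd
    if sd is None or sd <= 0:
        raise ValueError("a positive standard deviation is required")
    return (adj_mean_e - adj_mean_c) / sd


def effect_size_from_pre_post(pre_e: float, pre_c: float, pre_sd: float,
                              post_e: float, post_c: float, post_sd: float) -> float:
    """When adjusted means are not reported: posttest ES minus pretest ES."""
    return (post_e - post_c) / post_sd - (pre_e - pre_c) / pre_sd
```

For instance, with made-up adjusted posttest means of 52 and 50 and a control SD of 10, `effect_size(52.0, 50.0, control_sd=10.0)` gives an effect size of about +0.20.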

Effect sizes were pooled across studies for each program and for various categories of 
programs. This pooling used means weighted by the final sample sizes. The reason for using 
weighted means is to maximize the importance of large studies, as the previous reviews and 
many others have found that small studies tend to overstate effect sizes (see Rothstein et al.,
2005; Slavin & Smith, in press). 
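As a rough illustration of sample-size weighting, the sketch below pools effect sizes using final sample sizes as weights. The function name and the numbers in the usage note are hypothetical and are not taken from any study in this review.

```python
from typing import List, Tuple


def weighted_mean_effect_size(studies: List[Tuple[float, int]]) -> float:
    """Pool effect sizes, weighting each by its final sample size.

    `studies` is a list of (effect_size, final_n) pairs.
    """
    total_n = sum(n for _, n in studies)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    return sum(es * n for es, n in studies) / total_n


def unweighted_mean_effect_size(studies: List[Tuple[float, int]]) -> float:
    """Simple average, shown only for comparison with the weighted mean."""
    return sum(es for es, _ in studies) / len(studies)
```

With one hypothetical small study (ES = +0.50, n = 40) and one hypothetical large study (ES = +0.10, n = 400), the unweighted mean is +0.30 while the weighted mean is about +0.14, which shows how weighting limits the influence of a small study with a large effect size.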

Effect sizes were broken down for measures of decoding (e.g., word attack, letter-word 
identification, and fluency), vocabulary, and comprehension/total reading. In general, 
comprehension, which is the ultimate goal of reading instruction, is the most important outcome 
measure. Very few studies reported separate vocabulary scores, so the tables only show separate
outcomes for decoding and comprehension (although vocabulary measures are included in
totals). 

Criteria for Inclusion 

Criteria for inclusion of studies in this review were as follows. 

1. The studies evaluated initial (i.e., non-remedial) classroom programs for elementary 
reading. Studies of variables, such as use of ability grouping, block scheduling, or single- 
sex classrooms, were not reviewed. Studies of tutoring and remedial programs for 
struggling readers are reviewed in a separate article (Slavin et al., 2009).

2. The studies involved interventions that began when children were in elementary school, 
grades K-5. As noted earlier, studies that began and ended in kindergarten are reviewed 
separately. Programs beginning in K or 1 were categorized as beginning reading, while 
those beginning in 2-5 were categorized as upper elementary. 

3. The studies compared children taught in classes using a given reading program to those in 
control classes using an alternative program or standard methods. 

4. Studies could have taken place in any country, but the report had to be available in 
English. 

5. Random assignment or matching with appropriate adjustments for any pretest differences 
(e.g., analyses of covariance) had to be used. Studies without control groups, such as pre- 
post comparisons and comparisons to “expected” scores, were excluded. 

6. Pretest data had to be provided, unless studies used random assignment of at least 30 
units (individuals, classes, or schools) and there were no indications of initial inequality. 
Studies with pretest differences of more than 50% of a standard deviation were excluded
because, even with analyses of covariance, large pretest differences cannot be adequately
controlled for as underlying distributions may be fundamentally different (Shadish, Cook, 
& Campbell, 2002). 

7. The dependent measures included quantitative measures of reading performance, such as 
standardized reading measures. Experimenter-made measures were accepted if they were 
comprehensive measures of reading, which would be fair to the control groups, but 
measures of reading objectives inherent to the experimental program (but unlikely to be 
emphasized in control groups) were excluded. Studies using measures inherent to 
treatments, usually made by the experimenter or program developer, have been found to 
be associated with much larger effect sizes than are measures that are independent of 
treatments (Slavin & Madden, in press), and for this reason, effect sizes from treatment- 
inherent measures were excluded. The exclusion of measures inherent to the experimental 
treatment is a key difference between the procedures used in the present review and those 
used by the What Works Clearinghouse (2009). Measures of reading individually 
administered by the children’s own teachers were also excluded, on the basis that such
assessments are susceptible to bias. As noted above, measures of pre-reading skills such 
as phonological awareness, as well as related skills such as oral vocabulary, language 
arts, and spelling, were not included in this review. 

8. A minimum study duration of 12 weeks was required. This requirement is intended to 
focus the review on practical programs intended for use for the whole year, rather than 
brief investigations. Study duration is measured from the beginning of the treatments to 
posttest, so, for example, an intensive 8-week intervention in the fall of first grade would 
be considered a year-long study if the posttest were given in May. The 12-week criterion
has been consistently used in all of the systematic reviews done previously by the current
authors. This is another difference between the current review and the What Works 
Clearinghouse (2009) beginning reading topic report, which included very brief studies. 

9. Studies had to have at least 15 students and two teachers in each treatment group. 
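The inclusion criteria listed above can be read as a screening checklist. The following is a minimal sketch of such a screen, not the authors' actual procedure: every field name is a hypothetical paraphrase of a criterion, and judgments such as whether a measure is independent of the treatment are reduced to a single flag that a reviewer would have to set.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass
class Study:
    # Hypothetical fields paraphrasing criteria 1-9.
    initial_program: bool             # 1: non-remedial classroom program
    start_grade: int                  # 2: 0 = kindergarten, 1-5 = grades 1-5
    has_control_group: bool           # 3, 5
    report_in_english: bool           # 4
    randomized: bool                  # 5
    matched_with_adjustment: bool     # 5
    has_pretest: bool                 # 6
    units_assigned: int               # 6: individuals, classes, or schools
    initial_inequality: bool          # 6: any sign of initial inequality
    pretest_diff_sd: Optional[float]  # 6: pretest difference in SD units
    independent_measure: bool         # 7: not inherent to the treatment
    weeks_to_posttest: float          # 8: treatment start to posttest
    students_per_group: int           # 9: smallest treatment group
    teachers_per_group: int           # 9: smallest treatment group


def exclusion_reasons(s: Study) -> List[str]:
    """Return the criteria a study fails; an empty list means it qualifies."""
    reasons = []
    if not s.initial_program:
        reasons.append("not an initial (non-remedial) program")
    if not 0 <= s.start_grade <= 5:
        reasons.append("did not begin in grades K-5")
    if not s.report_in_english:
        reasons.append("report not available in English")
    if not s.has_control_group or not (s.randomized or s.matched_with_adjustment):
        reasons.append("no randomized or matched control group")
    pretest_waived = (s.randomized and s.units_assigned >= 30
                      and not s.initial_inequality)
    if not s.has_pretest and not pretest_waived:
        reasons.append("no pretest data")
    if s.pretest_diff_sd is not None and abs(s.pretest_diff_sd) > 0.5:
        reasons.append("pretest difference above 50% of a standard deviation")
    if not s.independent_measure:
        reasons.append("measure inherent to the treatment")
    if s.weeks_to_posttest < 12:
        reasons.append("duration under 12 weeks")
    if s.students_per_group < 15 or s.teachers_per_group < 2:
        reasons.append("fewer than 15 students or 2 teachers per group")
    return reasons


def reading_level(s: Study) -> str:
    """Criterion 2: programs starting in K or 1 count as beginning reading."""
    return "beginning" if s.start_grade <= 1 else "upper elementary"
```

Returning a list of reasons rather than a single yes/no mirrors the web appendices, which list each excluded study together with the reasons for exclusion.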

Limitations 

It is important to note several limitations of the current review. First, the review focuses 
on experimental studies using quantitative measures of reading. There is much to be learned 
from qualitative and correlational research that can add depth and insight to understanding the 
effects of reading programs, but this research is not reviewed here. Second, the review focuses 
on replicable programs used in realistic school settings expected to have an impact over periods 
of at least 12 weeks. This emphasis is consistent with the review’s purpose in providing 
educators with useful information about the strength of evidence supporting various practical 
programs, but it does not attend to shorter, more theoretically-driven studies that may also 
provide useful information, especially to researchers. Finally, the review focuses on traditional 
measures of reading performance, primarily individually-administered or group-administered 
standardized tests. These are useful in assessing the practical outcomes of various programs and 
are fair to control as well as experimental teachers, who are equally likely to be trying to help 
their students do well on these assessments. The review does not report on experimenter-made 
measures of content taught in the experimental group but not the control group, even though 
results on such measures may also be of importance to some researchers or educators. 






Categories of Research Design 



Four categories of research designs were identified. Randomized experiments (R) were 
those in which students, classes, or schools were randomly assigned to treatments, and data 
analyses were at the level of random assignment. When schools or classes were randomly 
assigned but there were too few schools or classes to justify analysis at the level of random 
assignment, the study was categorized as a randomized quasi-experiment (RQE) (Slavin, 2008). 
Matched (M) studies were ones in which experimental and control groups were matched on key 
variables at pretest, before posttests were known, while matched post-hoc (MPH) studies were 
ones in which groups were matched retrospectively, after posttests were known. Studies using 
fully randomized designs (R) are preferable to randomized quasi-experiments (RQE), but all 
randomized experiments are less subject to bias than matched studies. Among matched designs, 
prospective designs (M) were preferred to post-hoc matched designs (MPH). In the text and in 
tables, studies of each type of program are listed in this order (R, RQE, M, MPH). Within these 
categories, studies with larger sample sizes are listed first. Therefore, studies discussed earlier in 
each section should be given greater weight than those listed later, all other things being equal. 
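The design categories and listing order described above can be sketched as follows. The three yes/no inputs to `classify_design` are a simplification assumed for illustration, and the study names and sample sizes in the usage note are made up.

```python
from typing import Dict, List

# Listing order used in the text and tables: R, RQE, M, MPH.
DESIGN_RANK = {"R": 0, "RQE": 1, "M": 2, "MPH": 3}


def classify_design(randomly_assigned: bool,
                    analyzed_at_assignment_level: bool,
                    matched_before_posttest: bool) -> str:
    """Map three yes/no facts about a study onto the four design codes."""
    if randomly_assigned:
        # Too few units to analyze at the level of assignment -> RQE.
        return "R" if analyzed_at_assignment_level else "RQE"
    # Non-randomized studies: prospective matching (M) vs. post-hoc (MPH).
    return "M" if matched_before_posttest else "MPH"


def listing_order(studies: List[Dict]) -> List[Dict]:
    """Sort by design category, then by sample size (largest first)."""
    return sorted(studies, key=lambda s: (DESIGN_RANK[s["design"]], -s["n"]))
```

For example, given hypothetical studies A (M, n = 500), B (R, n = 100), C (R, n = 300), D (MPH, n = 900), and E (RQE, n = 50), `listing_order` returns them as C, B, E, A, D. The large post-hoc matched study is listed last despite its size, because design category takes precedence over sample size.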

For Additional Information

The following sections present summaries of findings and tables showing characteristics 
and findings of individual studies. Descriptions of individual studies have been withheld to meet 
the page limits of this journal, but can be seen in an online version at www.bestevidence.org. The 
web site presents reviews separately for beginning and upper-elementary reading. The web 
versions also include appendices listing all relevant studies excluded from the review and the
reasons for exclusion, as well as overall ratings of the strength of the evidence supporting use of
individual programs. 



Beginning Reading 

From the first day of kindergarten to the last day of first grade, most children go through 
an extraordinary transformation as readers. If all goes well, children at the end of first grade 
know the sounds of all the letters and can form them into words, know the most common sight 
words, and can read and comprehend simple texts. The K-1 period is distinct from other stages of
reading development because during this stage, children are learning all the basic skills of
turning print into meaning. From second grade on, children build fluency, comprehension, and
vocabulary for reading ever more complex text in many genres, but the K-1 period is
qualitatively different in its focus on basic skills. The following sections summarize research on 
programs for beginning reading. 

Research on Beginning Reading Curricula 

The reading curricula category consists of textbooks for initial (non-remedial) reading 
instruction. Some professional development is typically provided with these textbooks, but far 
less than would be typical of instructional process approaches. 

Table 1 summarizes descriptions and outcomes of all studies of curriculum programs for 
beginning reading. 



TABLE 1 HERE 






Beginning reading curricula have been evaluated in seven studies, five of which used 
randomized quasi-experiments. 

These studies evaluated three core basal reading programs, Open Court Reading, 
Reading Street, and Scholastic Phonics Readers with Literacy Place, plus three supplemental 
programs, the Open Court Phonics Kit, Phonics in Context, and Elements of Reading: Phonics 
and Phonemic Awareness. The sample size-weighted mean effect size across all seven was 
+0.12, with the four studies of core basal programs reporting a weighted mean effect size of
+0.11 and the three studies of supplementary programs reporting a weighted mean of +0.12. Effect
sizes averaged +0.23 for decoding measures, but only +0.09 for comprehension/total reading 
measures. 

Research on Instructional Technology for Beginning Reading

The effectiveness of instructional technology (IT) has been extensively debated over the 
past 20 years, and there is a great deal of research on the topic. Kulik (2003) concluded that 
research did not support use of IT in elementary or secondary reading, although E. Chambers 
(2003) came to a somewhat more positive conclusion. 

Thirteen studies of instructional technology for beginning reading met the standards for 
the present review. These were divided into three categories. Supplemental technology 
programs, such as Waterford, WICAT, and Phonics-Based Reading, are programs that provide 
additional instruction at students’ assessed levels of need to supplement traditional classroom 
instruction. Mixed-method models, represented by Writing to Read, are methods that use 
computer-assisted instruction along with non-computer activities as students’ core reading
approach. Embedded multimedia, represented by Reading Reels, provides video content
embedded in teachers’ whole-class lessons. 

Descriptions and outcomes of all studies of instructional technology in beginning reading 
that met the inclusion criteria appear in Table 2. 



TABLE 2 HERE 



The weighted mean effect size for all technology approaches in beginning reading was 
only +0.09 across 13 studies. A large, randomized study by Dynarski et al. (2007) and 
Campuzano et al. (2009) found no impact of five current supplemental CAI models. This study’s 
findings greatly affected the weighted mean of nine studies of supplementary CAI, estimated at 
+0.08. The weighted mean effect size for decoding measures, also substantially affected by the 
Dynarski/Campuzano findings, was only +0.05, although comprehension/total reading effects 
(not measured in the Dynarski/Campuzano study) averaged +0.20. Large effect sizes were 
reported in small, matched studies of Waterford and WICAT. Reading Reels, which uses 
multimedia embedded in teachers’ class lessons, had modest positive effects in two large 
randomized experiments (weighted mean ES=+0.20). With these potentially promising 
exceptions, research on the use of technology in beginning reading instruction does not show 
positive achievement effects of the types of software that have been most commonly used. 

Research on Instructional Process Programs for Beginning Reading 

Instructional process programs are methods that focus on providing teachers with 
extensive professional development to implement specific instructional methods. These fell into 
three categories. Cooperative learning programs (Slavin, 1995, 2009) use methods in which 
students work in small groups to help one another master academic content. Phonological 
awareness training is an approach that gives teachers specific classroom strategies for building 
phonics and phonemic awareness skills. Phonics-focused professional development models, 
including Reading and Integrated Literacy Strategies (RAILS), Sing, Spell, Read, and Write, 
Ladders to Literacy, Early Reading Research, and Orton Gillingham, provide training to 
teachers to help them effectively incorporate phonics, phonemic awareness, and other elements 
in beginning reading lessons. Note that two comprehensive programs combining instructional 
process approaches with innovative curricula, Success for All and Direct Instruction, are 
reviewed in a separate section of this article. 

Descriptions and outcomes of all studies of instructional process programs meeting the 
inclusion criteria appear in Table 3. 



TABLE 3 HERE 



Effects for instructional process programs were very positive. Across 17 studies, five of 
which were randomized quasi-experiments, the weighted mean effect size for instructional 
process approaches in beginning reading was +0.37. The mean was +0.47 for decoding measures 
and +0.30 for comprehension/total reading measures. In particular, positive effects were seen for 
cooperative learning programs such as Peer-Assisted Learning Strategies (PALS) and Classwide 
Peer Tutoring (mean ES=+0.46), phonics-focused professional development programs such as 
Sing, Spell, Read, and Write, Early Reading Research, and RAILS (mean ES=+0.43), and 
teaching of phonological awareness to kindergartners (mean ES=+0.22 on tests at the end of first 
or second grade). 



Research on Combined Curriculum and Instructional Process Approaches for Beginning 
Reading 

Evaluations of programs that provide complete curricula as well as extensive professional 
development in classroom instructional processes are summarized in Table 4. These consist of 
two programs, Success for All and Direct Instruction. 



TABLE 4 HERE 



Success for All (SFA) is a comprehensive school reform program designed to ensure 
success in reading for children in high-poverty schools (Slavin, Madden, Chambers, & Haxby, 
2009). It provides schools with a K-5 reading curriculum that focuses on phonemic awareness, 
phonics, comprehension, and vocabulary development, beginning with phonetically-controlled 
mini-books in grades K-1. Cooperative learning is extensively used at all grade levels. 
Struggling students, especially first graders, receive one-to-one tutoring. Extensive professional 
development and a full-time facilitator help teachers effectively apply all program elements. 
Across 23 studies involving more than 12,000 children, the weighted mean effect size for 
Success for All was +0.29. On decoding measures the overall mean was +0.33, and the mean was 
+0.27 for comprehension/total reading. 

Dating back to the 1960s, Direct Instruction (DI) is an approach to beginning reading 
instruction that emphasizes a step-by-step approach to phonics, decodable texts that make use of 
a unique initial teaching alphabet, and structured, scripted manuals for teachers. Across three 
evaluations of Direct Instruction, the weighted mean effect size for beginning reading was +0.10. 
However, it is important to note that in other reviews that examined effects of DI in all 
elementary grades (not just K-1), this program has been rated as among the strongest in reading 
outcomes (e.g., Herman, 1999; Borman et al., 2003; CSRQ, 2006). 

Kindergarten-Only Studies 

As noted earlier, studies that take place only during kindergarten can pose serious 
methodological challenges. Because the goals of kindergarten instruction vary a great deal from 
place to place, and have changed dramatically over the past 30 years, it is always possible that 
any experimental-control difference on an end-of-kindergarten reading measure is simply due to 
the fact that the control group was not being taught to read. Even when reading is being taught, 
kindergarten classes can vary greatly in their emphasis on phonics, so measures of word attack 
and phonological awareness can be easily inflated by programs that focus on these skills earlier 
than the control treatment does. Still, it is useful to know about kindergarten-only studies, as they 
can provide initial indications of programs worth following through to first grade and beyond. 

Thirteen studies met the standards of the review but took place only during the 
kindergarten year. These are summarized in Table 5. 



TABLE 5 HERE 



The kindergarten-only studies generally support the conclusions of the studies that follow 
children through first grade and beyond. It is important to note that many of the programs cited 
in the main review, which tested children at the end of first grade, also reported very positive 
outcomes during kindergarten. These are also programs with a strong emphasis on phonics 
and/or cooperative learning, including Success for All (e.g., Jones et al., 1997), and the 
phonological awareness training programs (e.g., Lundberg et al., 1988). 

Overall Patterns of Outcomes: Beginning Reading 

Across all categories, there were 63 qualifying studies of beginning reading programs 
that posttested children at the end of first grade or later. Nineteen of the studies used random 
assignment (8 were fully randomized and 11 were randomized quasi-experiments). The sample 
size-weighted mean effect size was +0.22. These studies, involving more than 22,000 children, 
were identified from among more than 2000 studies initially reviewed, and represent those that 
used rigorous experimental procedures. 
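
As an illustration of what a sample size-weighted mean effect size involves, the following Python sketch weights each study's effect size by its number of students. The study values are hypothetical and are not taken from the tables in this review, and the function name is an assumption made for the example. The exact weighting rules applied in the review may differ, for instance in how very large studies are handled.

```python
def weighted_mean_effect_size(studies):
    """Return the sample size-weighted mean of (effect_size, n_students) pairs."""
    total_n = sum(n for _, n in studies)
    if total_n == 0:
        raise ValueError("at least one study must have a nonzero sample size")
    # Each effect size counts in proportion to the number of students in the study.
    return sum(es * n for es, n in studies) / total_n


# Hypothetical studies, for illustration only: (effect size, number of students)
example_studies = [(0.40, 100), (0.10, 300), (0.25, 200)]

unweighted = sum(es for es, _ in example_studies) / len(example_studies)
weighted = weighted_mean_effect_size(example_studies)
print(f"unweighted mean: {unweighted:+.2f}")
print(f"weighted mean:   {weighted:+.2f}")
```

In this made-up example the large study with a small effect pulls the weighted mean below the unweighted mean, which is the same mechanism by which a single large evaluation can dominate a category mean.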

Overall effects were somewhat stronger for decoding measures (such as Woodcock Word 
Attack and Letter-Word Identification) than for measures of comprehension and total reading. 
Across all studies, the weighted mean effect size was +0.27 for decoding measures and +0.20 for 
comprehension/total reading. Comprehension measures were more likely to show positive effects 
in multiyear studies that followed children into second grade or beyond. 

There are several important patterns in the findings on beginning reading programs that 
are worthy of note. First, this article finds that successful programs almost always provide 
teachers with extensive professional development and followup focused on specific teaching 
methods. In particular, most of the beginning reading programs with strong evidence of 
effectiveness have cooperative learning at their core: Success for All, Peer-Assisted Learning 
Strategies, Reading Reels, and Classwide Peer Tutoring all emphasize children working with 
other children on structured activities. These are all forms of cooperative learning in which 
students work in small groups to help one another master reading skills, and in which the success 
of the team depends on the individual learning of each team member, the elements that previous 
reviewers (e.g., Rohrbeck et al., 2003; Slavin, 1995, 2009; Webb, 2008) have identified as 
essential to the effectiveness of cooperative learning. 

Second, all of the beginning reading programs found to be effective or promising in 
qualifying experiments have a strong focus on teaching phonics and phonemic awareness. This is 
particularly true of Success for All, PALS, Reading Reels, phonological awareness training, Open 
Court Phonics Kits, Scholastic Phonics Readers with Literacy Place, Early Reading Research, 
Reading and Integrated Literacy Strategies (RAILS), Direct Instruction, and Phonics-Based 
Reading. It is important to note that studies of all of these programs found positive effects on 
comprehension and/or total reading measures, not just decoding measures that would appear 
more slanted toward phonetic approaches. However, an emphasis on phonics did not guarantee 
positive effects. Phonetic curricular approaches and supplemental computer-assisted instruction 
models, in particular, had minimal impacts on student outcomes. A large-scale evaluation of 
phonics-focused professional development by Garet et al. (2008) similarly found minimal effects 
for second graders. It clearly matters a great deal how reading is taught, and an emphasis on 
phonics may be necessary but it is not sufficient to ensure meaningful reading gains. 

One key implication of the Gamse et al. (2008) evaluation of Reading First is that it is not 
enough to encourage teachers to emphasize phonics, phonemic awareness, and other elements. 
The Moss et al. (2008) report that analyzed differences between Reading First and similar Title I 
schools that did not receive Reading First funding found that Reading First teachers were in fact 
spending more time teaching reading, and specifically more time on phonics, phonemic 
awareness, fluency, vocabulary, and comprehension. The Reading First teachers were 
significantly more likely to use basal textbooks that were revisions of traditional basals designed 
primarily to increase the focus on phonics and phonemic awareness. In order of popularity in 
Reading First schools, these were Harcourt Trophies (22.5% of RF, 15.0% of non-RF), Open 
Court Reading (15.4% vs. 9.8%), Scott Foresman Reading (13.0% vs. 12.2%), and Houghton 
Mifflin’s Nation’s Choice (10.7% vs. 2.5%). Yet none of these had ever been evaluated at the 
beginning of Reading First, and only Open Court Reading has been adequately evaluated since 
then, in a study that found modest impacts (ES=+0.17; Borman, Dowling, et al., 2007). If 
adopting books with more phonics and spending a few more minutes each day on the five 
elements recommended by the National Reading Panel (2000) were sufficient to improve 
beginning reading performance, the Gamse et al. (2008) national evaluation would have found 
significant positive effects. The research summarized in the present review points in a different 
direction. It supports the use of well-developed programs that integrate curriculum, pedagogy, 
and extensive professional development. 

Upper Elementary Reading Programs 

From second to fifth grade, children go through a critical transformation as readers. Most 
beginning second graders are able to decode, to recognize key sight words, to comprehend 
simple texts, and to read with some degree of fluency. The tasks that lie ahead of them, 
however, are qualitatively different from those they have navigated so far. They must consolidate 
and extend their basic skills, to be sure, and they must become fluent, confident readers. But 
most importantly, children in the upper elementary grades must become strategic comprehenders 
of increasingly sophisticated text. They must build a vocabulary of words and concepts as well as 
a vocabulary of cognitive and metacognitive approaches to texts. While decoding skills may 
develop in a fairly step-by-step progression, the skills mastered in the upper elementary grades 
emerge as children read in many genres and learn how to make sense of what they read, a less 
straightforward process. Early decoding success is a key predictor of success in the upper 
elementary grades and beyond (e.g., Juel, 1988), yet there are many children who are successful 
decoders but poor comprehenders. This period is also distinct from the middle grades, when 
reading is not typically taught as a separate subject but is subsumed in English or 
language arts. 

Because of the different objectives and requirements of the upper elementary grades, 
programs that are effective in building beginning reading skills are not necessarily optimal in the 
upper elementary grades, and vice versa. For this reason, in reviewing research on effective 
reading programs, it is important to review programs at each of these levels separately. This 
section focuses on studies of non-remedial classroom reading approaches that begin in grades 
2-5. 



Current Issues in Upper-Elementary Reading 

In recent years, reading in the upper elementary grades has taken on particular centrality 
because of the growing importance of test-based accountability. In the U.S., state accountability 
systems have long emphasized performance in grades 3-5 as the indicator of elementary school 
success, and in 2001, No Child Left Behind heightened this emphasis, requiring testing of 
reading and math in every grade from three to eight, and adding sanctions for schools not making 
adequate yearly progress. In England, Key Stage 2 assessments in reading and math in Year 6 
(age 11) are the main indicators of primary school success. 



Despite the obvious importance of upper-elementary reading for policy and practice, 
there has never been a review of research on effective programs at this grade level. The federal 
What Works Clearinghouse (2009) has created a topic report on beginning reading programs, 
and this synthesis included studies with students up to third grade. However, the WWC excluded 
studies that included grades above 3 if they did not analyze data separately for grades above and 
below third grade, and this excluded many upper-elementary studies that included grades 2-4, 
3-5, and so on. At this writing, the WWC has not announced a plan to do an upper-elementary 
reading review. Deshler, Palincsar, Biancarosa, & Nair (2007) published a major “research-based 
guide to instructional programs and practices” for struggling adolescent readers. It contains brief 
discussions of the research evidence supporting each of 48 widely-used programs, as well as lists 
of articles for each, and many of the articles reported studies of grades 3-6. Yet Deshler et al. 
(2007) did not attempt to synthesize or compare the evidence bases for the programs at any grade 
level. 

The review of research on upper-elementary reading programs summarized in this section 
uses methods identical to those used in the beginning reading review, except that programs had 
to have begun in grades 2-5. This synthesis groups upper elementary reading programs in three 
categories, defined previously for beginning reading programs: reading curricula, instructional 
technology, and instructional process programs. Reading curricula primarily encompass core 
reading textbooks and curricula, such as Scott Foresman’s Reading Street, as well as 
supplementary texts such as Scholastic’s Fluency Formula. Instructional technology (IT) refers 
to programs that use technology to enhance reading achievement, especially computer-assisted 
instruction (CAI). Instructional process programs are the most diverse. All programs in this 
category rely primarily on professional development to give teachers effective strategies for 
teaching reading. These include programs focusing on cooperative learning, classroom 
motivation and management, and metacognitive strategies. Examples include Cooperative 
Integrated Reading and Composition, Peer-Assisted Learning Strategies (PALS), Exemplary 
Center for Reading Instruction (ECRI), and Consistency Management-Cooperative Discipline 
(CMCD). 

Research on Upper Elementary Reading Curricula 

The reading curricula category includes 7 qualifying studies of core basal textbooks and 8 
studies of supplementary texts used as initial instruction with all students. Characteristics and 
findings of individual studies appear in Table 6. 



TABLE 6 HERE 



Both core and supplemental reading curricula for the upper-elementary grades have been 
studied in high-quality evaluations. Among 15 studies, there were five randomized experiments 
as well as four randomized quasi-experiments, involving more than 10,000 students. These 
studies found few effects on student reading achievement. The weighted mean effect size for 
core reading curricula was only +0.06, and for supplementary curricula it was +0.08, with an 
overall weighted mean of +0.06. The mean for the randomized studies and randomized quasi- 
experiments was +0.04. The only curriculum with promising effects was Open Court (average 
ES = +0.18), but in both of the studies of this program teachers received far more professional 
development than that usually provided, and in both studies Open Court was used for 2½ hours 
per day while control students had 90 minutes of reading. 



Research on Instructional Technology Programs for Upper Elementary Grades 



Thirty-one studies of instructional technology for grades 2-6 met the standards for this 
review. These were divided into three categories. Supplemental CAI programs, such as 
Jostens/Compass Learning, Academy of Reading, LeapTrack, My Reading Coach, and 
CCC/Successmaker provided additional instruction at students’ assessed levels of need to 
supplement traditional classroom instruction. Computer-Managed Learning Systems included 
only Accelerated Reader. This program uses computers to assess students’ reading levels, assign 
reading materials at students’ levels, score tests on those readings, and chart students’ progress, 
but students do not work directly on the computer. Innovative Technology Applications included 
Fast ForWord and Lightspan. 

Descriptions and outcomes of all studies of instructional technology in upper elementary 
reading that met the inclusion criteria appear in Table 7. 



TABLE 7 HERE 



Among the 31 qualifying upper-elementary studies that evaluated various forms of 
instructional technology, eight used random assignment to treatments. The studies involved a 
total of more than 10,000 students. Overall, the sample size-weighted mean effect size was very 
small (ES=+0.06). The randomized evaluations (n=8) had a weighted mean effect size of +0.05. 
These findings support Kulik’s (2003) conclusion that effects of computer-assisted instruction in 
reading are minimal. 

None of the three categories of instructional technology programs had convincing 
positive effects. Across 25 studies of supplemental programs (such as Jostens and CCC), the 
weighted mean effect size was +0.05. Two studies of Accelerated Reader had a mean effect size 
of +0.06. Effect sizes were higher but samples were small in two studies of Fast ForWord, 
which had a mean effect size of +0.21, and a small study of Lightspan had an effect size of 
+0.42. 

It is important to note that there is no trend toward more positive effects of IT in more 
recent studies. Among 11 studies reported since 2000, the weighted mean effect size was only 
+0.06, and the large, randomized study by Dynarski et al. (2007; Campuzano et al., 2009) found 
no significant effects of use of a variety of modern software on the reading achievement of fourth 
graders (ES=+0.02). Most of the IT studies involved use of computers as supplements to regular 
classroom instruction, usually for about 30 minutes, one to three times a week. It may be that 
more intensive uses of IT would produce more robust effects, and the study of My Reading 
Coach, which provided computerized instruction 45 minutes every day and showed positive 
effects (ES=+0.24) in a large randomized evaluation, is a hint in this direction. Another 
promising use of technology is in integrated computer and non-computer instruction, as done in 
Read 180, successfully evaluated in the middle grades (Slavin et al., 2008). However, the 
evidence summarized here clearly indicates that the types of supplementary computer-assisted 
instruction programs that have dominated the use of technology in education for thirty years are 
not producing significant effects in upper-elementary reading. Many studies of IT are of high 
quality and many of them involve large samples. It is difficult to imagine that such a large 
number of studies would fail to detect a meaningful impact if it existed. 



Research on Upper Elementary Instructional Process Programs 



Instructional process programs are methods that focus on providing teachers with 
extensive professional development to implement specific instructional methods. In upper 
elementary reading, instructional process programs are quite diverse. Thirty-three studies, six of 
which used random assignment, evaluated a broad range of approaches. Cooperative learning 
programs (Slavin, 1995, 2009; Webb, 2008) use methods in which students work in small groups 
to help one another master academic content. 

Strategy instruction programs teach students cognitive and metacognitive skills such as 
summarization, graphic organizers, and prediction to help them comprehend text. Strategy 
instruction is often combined with other methods, especially cooperative learning and peer 
tutoring. Structured phonetic intervention programs are approaches emphasizing phonics, 
systematic instruction, and frequent assessment of student progress. Phonics-focused 
professional development programs are ones that teach teachers the NRP elements, especially 
phonics and phonemic awareness, mostly in workshops. Integrated language arts programs are 
less structured and less phonetic, and focus on integrating reading and writing, literature study, 
and pleasure in reading. Cross-age tutoring programs involve older children working with 
younger ones, and same-age tutoring involves having children take turns tutoring one another. 
Classroom management and motivation programs focus on building a positive learning 
environment. 

Descriptions and outcomes of all studies of upper elementary instructional process 
programs meeting the inclusion criteria appear in Table 8. 



TABLE 8 HERE 



Both the methods and the findings of instructional process programs for upper- 
elementary reading were quite diverse. Across 33 experimental-control comparisons, involving 
more than 17,000 students, the weighted mean effect size was +0.21. These include four 
randomized and two RQE studies. 

Ten of the studies evaluated two forms of cooperative learning. These had a weighted 
mean effect size of +0.21. All but one of the cooperative learning studies evaluated Cooperative 
Integrated Reading and Composition (CIRC), which involves students in well-structured 
cooperative groups within which they help each other master and apply metacognitive learning 
strategies. CIRC was the basis for middle school reading programs called Student Team Reading 
and The Reading Edge, which had a weighted mean effect size of +0.29 in four secondary 
studies. The consistent positive effects of this family of cooperative learning approaches support 
the idea that programs focusing on professional development in structured activities that engage 
children in discussions about reading, giving them opportunities to help each other learn and use 
metacognitive skills, may have particular promise for enhancing reading achievement from the 
second grade onward. Positive effects were also found for cross-age tutoring programs 
(ES=+0.26 in 4 studies) and for same-age tutoring (ES=+0.26 in 2 studies), reinforcing the 
conclusion that structuring interaction among students on reading strategies is an effective 
approach. Another promising category was programs emphasizing metacognitive strategy 
instruction, such as Reciprocal Teaching and Thinking Maps, which had a weighted mean effect 
size of +0.32 in 5 studies. In these programs, students were taught skills such as prediction, 
summarization, and self-evaluation. 

It is important to note that additional instructional process programs also showed positive 
effects, but because the studies evaluating these approaches involved small groups of struggling 
readers rather than students in general, these findings are reviewed by Slavin et al. (2009). These 
include DISTAR/Corrective Reading, PALS, and Empower Reading. 

Overall Patterns of Outcomes: Upper Elementary Reading 

Across all categories, there were 79 qualifying studies of upper-elementary school 
reading programs involving a total of more than 32,000 students, of which 23 used random 
assignment (16 were fully randomized and 7 were randomized quasi-experiments). The overall 
sample size-weighted mean effect size was +0.13. The mean effect sizes of +0.06 for reading 
curricula and +0.06 for technology contrast with a mean of +0.21 for instructional process 
programs, such as cooperative learning and strategy instruction, reinforcing the findings of the 
beginning reading review. 



Outcomes for High Poverty Schools 

An important question for policy and practice is whether effects of various programs are 
particularly strong or weak for students in high-poverty schools. To examine this question, 
schools in each study were defined as ‘high-poverty’ if at least 50% of their students qualified 
for free or reduced-price lunches, or if other information in the study (such as a description of 
schools as serving high-poverty neighborhoods) indicated high poverty status. Forty-one 
beginning reading and thirty-one of the upper-elementary studies involved high-poverty schools, 
by this definition. At beginning and upper-elementary grade levels, outcomes were very similar 
for high-poverty schools (mean ES=+0.15) and low-poverty schools (mean ES=+0.14). Among 
the studies of reading curricula, weighted mean effect sizes were +0.07 (n=14) for high-poverty 
schools and +0.09 (n=8) for low-poverty schools. For IT, the weighted mean effect sizes were 
+0.08 (n=17) for high-poverty schools and +0.06 (n=26) for low-poverty schools. Among 
studies of instructional process programs, including beginning reading programs that combine 
instructional process and curriculum, the weighted mean effect sizes were +0.27 (n=45) for high- 
poverty schools and +0.20 (n=31) for low-poverty schools. 
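
The classification rule and subgroup comparison described above can be sketched in Python as follows. This is only an illustrative sketch: the study records are hypothetical, the field and function names are assumptions made for the example, and treating a study with no indication of high poverty as low-poverty is likewise an assumption rather than a rule stated in the review.

```python
from collections import defaultdict

HIGH_POVERTY_THRESHOLD = 50.0  # percent qualifying for free or reduced-price lunch


def poverty_group(pct_free_lunch, described_as_high_poverty=False):
    """Apply the stated rule: at least 50%, or a high-poverty description."""
    if pct_free_lunch is not None and pct_free_lunch >= HIGH_POVERTY_THRESHOLD:
        return "high-poverty"
    if described_as_high_poverty:
        return "high-poverty"
    # Assumption: studies with no indication of high poverty count as low-poverty.
    return "low-poverty"


def weighted_means_by_group(studies):
    """Sample size-weighted mean effect size within each poverty group."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of es * n, sum of n]
    for study in studies:
        group = poverty_group(
            study["pct_free_lunch"],
            study.get("described_as_high_poverty", False),
        )
        totals[group][0] += study["es"] * study["n"]
        totals[group][1] += study["n"]
    return {group: weighted / n for group, (weighted, n) in totals.items()}


# Hypothetical studies, for illustration only.
example_studies = [
    {"es": 0.30, "n": 200, "pct_free_lunch": 80.0},
    {"es": 0.20, "n": 200, "pct_free_lunch": None, "described_as_high_poverty": True},
    {"es": 0.10, "n": 100, "pct_free_lunch": 20.0},
]

for group, mean_es in sorted(weighted_means_by_group(example_studies).items()):
    print(f"{group}: {mean_es:+.2f}")
```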

As in the overall set of studies, the studies of high-poverty schools supported the 
observation that programs that provide extensive professional development to teachers in 
specific classroom strategies are most likely to make a difference in the achievement of students 
in high-poverty schools. From a policy perspective, what these findings imply is that proven 
models could be used effectively in any type of school, but in order to reduce gaps according to 
socioeconomic status, these programs should be particularly encouraged among high-poverty 
Title I schools. 



Overall Discussion 

The research reviewed in this article provides reason for optimism about the 
improvement of basic reading instruction in the elementary grades. Sixty-three studies of 
beginning reading programs and 79 studies of upper-elementary reading programs met stringent 
methodological requirements, and these studies provide support for many replicable approaches. 
More research on a larger set of programs is needed, of course, but the research that already 
exists provides educators and policy makers with several robust approaches they could choose to 
improve their students’ reading performance. Those programs have been shown to be effective in 
high-poverty as well as less disadvantaged schools, so if the effective programs were 
implemented with integrity by many schools serving disadvantaged students, this could 
significantly reduce achievement gaps between middle-class and lower-class children. The 
research also identified types of approaches that have not been successful in improving 
elementary reading performance. 

Several important patterns in the findings are worthy of note. First, for both 
beginning reading and upper-elementary reading, this article finds extensive evidence supporting 
forms of cooperative learning in which students work in small groups to help one another master 
reading skills, and in which the success of the team depends on the individual learning of each 
team member. In beginning reading, examples of cooperative learning included PALS, and 
cooperative learning is a key component of Success for All. In upper-elementary reading, the 
category is primarily represented by Cooperative Integrated Reading and Composition (CIRC). 
Positive effects for studies of cross-age and same-age tutoring at all grade levels also reinforce 
the value of engaging students in structured peer-to-peer interactions. The finding of positive 
effects of cooperative learning programs is consistent with the findings of reviews of secondary 
reading programs (Slavin, Cheung, Groff, & Lake, 2008) and elementary and secondary math 
programs (Slavin, Lake, & Groff, in press; Slavin & Lake, 2008). 

Also consistent with previous reviews is the finding that both alternative curricula and 
instructional technology generally produced small effects on reading measures at all grade levels. 
In particular, the evidence did not support the idea that simply introducing materials or training 
with a strong emphasis on phonics will significantly improve reading outcomes. Effects of 
adopting phonetic textbooks were very small, and a large study of LETRS, a professional 
development program focused on phonics, also found disappointing results (Garet et al., 2008). 
These findings suggest that while phonics appears necessary in reading instruction, adding a 
phonics focus is not enough to increase reading achievement. 

The findings of this review add to a growing body of evidence that what 
matters for student achievement are approaches that fundamentally change what teachers and 
students do together every day. These programs are characterized by extensive professional 
development in classroom strategies intended to maximize students’ participation and 
engagement, give them effective metacognitive strategies for comprehending text, and strengthen 
their phonics skills. As in earlier reviews, such strategies had outcomes that were clearly and 
consistently more positive than those found for curricula or IT alone. These positive effects were 
found equally for high-poverty and low-poverty schools, and they were found on comprehension 
as well as decoding measures. More research and development of reading programs for 
elementary students is clearly needed, but this review identifies several promising approaches 
that could be used today to help students succeed in reading in the elementary grades. 



References 



Abram, S.L. (1984). The effect of computer assisted instruction on first grade phonics and 
mathematics achievement computation. Unpublished doctoral dissertation, Northern 
Arizona University. 

Adams, M.J. (1990). Beginning to read: Thinking and learning about print. Cambridge, MA: 
MIT Press. 

Alifrangis, C.M. (1991). An integrated learning system in an elementary school: Implementation, 
attitudes, and results. Journal of Computing in Childhood Education, 2(3), 51-66. 

Apthorp, H. (2005). Elements of Reading: Phonics and Phonemic Awareness. Orlando: Harcourt. 

Apthorp, H. (December, 2005b). A study of the effects of Harcourt Achieve's Elements of 
Reading: Vocabulary. Denver, CO: Mid-Continent Research for Education and Learning. 

Apthorp, H. (November, 2005). A study of the effects of Harcourt Achieve's Elements of 
Reading: Fluency. Denver, CO: Mid-Continent Research for Education and Learning. 

August, D., & Shanahan, T. (2006). Synthesis: Instruction and professional development. In D. 
August & T. Shanahan (Eds.), Developing literacy in second-language learners (pp. 351- 
364). Mahwah, NJ: Erlbaum. 

Baer, J., Baldi, S., Ayotte, K., Green, P.J., & McGrath, D. (2007). The reading literacy of U.S. 
fourth-grade students in an international context: Results from the 2001 and 2006 
Progress in International Reading Literacy Study (PIRLS). Washington, DC: National 
Center for Education Statistics, U.S. Department of Education. 

Barnett, L. B. (2006). The effect of computer-assisted instruction on the reading skills of 
emergent readers. Unpublished doctoral dissertation, Florida Atlantic University. 

Barrett, T.J. (1995). A comparison of two approaches to first grade phonics instruction in the 
Riverside Unified School District. Paper presented at the annual meeting of the California 
Educational Research Association, Lake Tahoe. 

Beasley, N. (1989). The effects of IBM Writing to Read program on the achievement of selected 
first grade students. Dissertation Abstracts International, 51(3), 739A. (UMI No. 
9122247). 

Becker, H.J. (1994). Mindless or mindful use of integrated learning systems. International 
Journal of Educational Research, 21(1), 65-79. 

Birch, J. (2002). The effects of the Delaware Challenge Grant Program on the standardized 
reading and mathematics test scores of second and third grade students in the Caesar 
Rodney School District. Unpublished doctoral dissertation, Wilmington College. 

Blachman, B.A., Tangel, D., Ball, E., Black, R., & McGraw, C. (1999). Developing 
phonological awareness and word recognition skills: A two-year intervention with low- 
income, inner-city children. Reading and Writing: An Interdisciplinary Journal, 11, 239- 
273. 



Blackmon, C.M. (2008). The efficacy of Foundations & Frameworks on elementary students' 
reading achievement in urban Christian schools. Unpublished doctoral dissertation, 
Liberty University. 

Bond, C., Ross, S.M., Smith, L.J., & Nunnery, J.A. (1995). The effects of the Sing, Spell, Read, 
and Write program on reading achievement of beginning readers. Reading Research and 
Instruction, 35, 122-141. 

Borman, G.D., & Dowling, N.M. (2007). Student and teacher outcomes of the Superkids quasi- 
experimental study. Madison, WI: University of Wisconsin. 

Borman, G. D., Dowling, N. M., & Schneck, C. (2007). The national randomized field trial of 
Open Court Reading. Madison, WI: University of Wisconsin. 

Borman, G. D., & Hewes, G. (2003). Long-term effects and cost effectiveness of Success for 
All. Educational Evaluation and Policy Analysis, 24(2), 243-266. 

Borman, G.D., Hewes, G.M., Overman, L.T., & Brown, S. (2003). Comprehensive school 
reform and achievement: A meta-analysis. Review of Educational Research, 73(2), 125- 
230. 

Borman, G. D., & Rachuba, L. T. (2001). Evaluation of the Scientific Learning Corporation's 
Fast ForWord computer-based training program in the Baltimore City Public Schools. 
Report prepared for the Abell Foundation, August. 

Borman, G.D., Slavin, R.E., Cheung, A., Chamberlain, A., Madden, N.A., & Chambers, B. 
(2007). Final reading outcomes of the national randomized field trial of Success for All. 
American Educational Research Journal, 44(3), 701-731. 

Bramlett, R. K. (1994). Implementing cooperative learning: A field study evaluating issues for 
school-based consultants. Journal of School Psychology, 32 (1), 67-84. 

Brown, I.S., & Felton, R.H. (1990). Effects of instruction on beginning reading skills in children 
at risk for reading disability. Reading and Writing: An Interdisciplinary Journal, 2, 223- 
241. 

Bryg, V. (1984). The effect of computer assisted instruction upon reading achievement with 
selected fourth grade children. Unpublished doctoral dissertation, University of 
Nebraska. 

Calderon, M., Hertz-Lazarowitz, R., & Slavin, R.E. (1998). Effects of bilingual cooperative 
integrated reading and composition on students making the transition from Spanish to 
English reading. The Elementary School Journal, 99(2), 153-165. 

Calhoon, M., Al Otaiba, S., Cihak, D., King, A., & Avalos, A. (2007). The effects of a peer- 
mediated program on reading skill acquisition for two-way bilingual first-grade 
classrooms. Learning Disability Quarterly, 30(3), 169-184. 

Calhoon, M., Otaiba, S., Greenberg, D., King, A., & Avalos, A. (2006). Improving reading skills 
in predominately Hispanic Title I first grade classrooms: The promise of Peer-Assisted 
Learning Strategies. Learning Disabilities Research and Practice, 21(4), 261-272. 



Campbell, C., & Brigman, G. (2005). Closing the achievement gap: A structured approach to 
group counseling. The Journal for Specialists in Group Work, 30(1), 67-82. 

Campbell, J.P. (2000). A comparison of computerized and traditional instruction in the area of 
elementary reading. Unpublished doctoral dissertation, University of Alabama. 

Campuzano, L., Dynarski, M., Agodini, R., & Rall, K. (2009). Effectiveness of reading and 
mathematics software products: Findings from two student cohorts. Washington, DC: 
U.S. Department of Education. 

Carbo, M. (1982). Carbo Reading Styles Inventory. New York: Learning Research Associates. 

Carbo, M., Dunn, R., & Dunn, K. (1986). Teaching students to read through their individual 
learning styles. New York: Allyn & Bacon. 

Carlson, C.D., & Francis, D.J. (2002). Increasing the reading achievement of at-risk children 
through Direct Instruction: Evaluation of the Rodeo Institute for Teacher Excellence 
(RITE). Journal of Education for Students Placed at Risk, 7 (2), 141-166. 

Carrick, L.U. (2000). The effects of Reader’s Theater on fluency and comprehension of fifth 
grade students in regular classes. Unpublished doctoral dissertation, Lehigh University. 

Casey, J., Smith, L., & Ross, S. (1994). Final report: 1993-94 Success for All program in 
Memphis, Tennessee: Formative evaluation of new SFA schools. Memphis, TN: 
University of Memphis, Center for Research in Educational Policy. 

Cassady, J., & Smith, L. (2005). The impact of a structured integrated learning system on first 
grade students' reading gains. Reading and Writing Quarterly, 21(4), 361-376. 

Chall, J.S. (1983). Literacy: Trends and explanations. Educational Researcher, 12 (9), 3-8. 

Chambers, B., Cheung, A., Madden, N., Slavin, R. E., & Gifford, R. (2006). Achievement 
effects of embedded multimedia in a Success for All reading program. Journal of 
Educational Psychology, 98(1), 232-237. 

Chambers, B., Slavin, R. E., Madden, N. A., Abrami, P.C., Tucker, B. J., Cheung, A., & Gifford, 
R. (2008). Technology infusion in Success for All: Reading outcomes for first graders. 
Elementary School Journal, 109(1), 1-15. 

Chambers, B., Slavin, R.E., Madden, N.A., Cheung, A., & Gifford, R. (2004). Enhancing 
Success for All for Hispanic students: Effects on beginning reading achievement. 
(Tech. Rep.). Baltimore: Johns Hopkins University, Center for Data-Driven Reform in 
Education. 

Chambers, B., Slavin, R. E., Madden, N. A., Cheung, A., & Gifford, R. (2005). Effects of 
Success for All with embedded video on the beginning reading achievement of Hispanic 
children. Technical Report. Center for Research and Reform in Education, Johns Hopkins 
University. 

Chambers, E. A. (2003). Efficacy of educational technology in elementary and secondary 
classrooms: A meta-analysis of the research literature from 1992-2002. Unpublished 
doctoral dissertation, Southern Illinois University at Carbondale. 



Cheung, A., & Slavin, R.E. (2005). Effective reading programs for English language learners 
and other language minority students. Bilingual Research Journal, 29 (2), 241-267. 

Clariana, R.B. (1994). The effects of an integrated learning system on third graders’ 
mathematics and reading achievement. San Diego, CA: Jostens Learning Corporation. 
(ERIC Document Reproduction Service No. ED 409 181). 

Clayton, I.L. (1992). The relationship between computer-assisted instruction in reading and 
mathematics achievement and selected student variables. Unpublished doctoral 
dissertation, The University of Southern Mississippi. 

Cohen, K. (1991). A comparative study of reading instruction management for selected third- 
grade students in an urban school district. Unpublished doctoral dissertation, University of 
North Texas. 

Collis, B., Ollila, L., & Ollila, K. (1990). Writing to Read: An evaluation of a Canadian 
installation of a computer-supported initial language environment. Journal of Educational 
Computing Research, 6(4), 411-427. 

Comprehensive School Reform Quality Center (2006). CSRQ center report on elementary 
comprehensive school reform models. Washington, DC: American Institutes for Research. 

Conner, J.M., Greene, B.G., & Munroe, K. (2004). An experimental study of the instructional 
effectiveness of the Harcourt Reading Program in academically at-risk schools in the 
Philadelphia City School District. Bloomington, IN: Educational Research Institute of 
America. 

Cook, T., Shadish, W.R., & Wong, V.C. (2008). Three conditions under which experiments and 
observational studies produce comparable causal estimates: New findings from within 
study comparisons. Paper presented at the annual meetings of The Society for Research 
on Effective Education, Crystal City, VA. 

Coomes, P. (1985). The effects of computer assisted instruction on the development of reading 
and language skills. Unpublished doctoral dissertation, North Texas State University. 

Cooper, H. (1998). Synthesizing research (3rd ed.). Thousand Oaks, CA: Sage. 

Cooperman, K.S. (1985). An experimental study to compare the effectiveness of a regular 
classroom reading program to a regular classroom reading program with a computer- 
assisted instruction program in reading comprehension skills in grades two through four. 
Unpublished doctoral dissertation, The American University. 

Correnti, R. (2009). Examining CSR program effects on student achievement: Causal 
explanation through examination of implementation rates and student mobility. Paper 
presented at the 2nd annual conference of the Society for Research on Educational 
Effectiveness, Washington, DC, March, 2009. 



D’Agostino, J. (2009). The effectiveness of the Superkids on student achievement and teacher 
outcomes. Columbus, OH: Ohio State University. 

Deshler, D., Palincsar, A., Biancarosa, G., & Nair, M. (2007). Informed choices for struggling 
adolescent readers. Newark, DE: International Reading Association. 

Dianda, M., & Flaherty, J. (1995, April). Effects of Success for All on the reading achievement 
of first graders in California bilingual programs. Paper presented at the annual meeting 
of the American Educational Research Association, San Francisco. 

Dynarski, M., Agodini, R., Heaviside, S., Novak, T., Carey, N., Campuzano, L., Means, B., 
Murphy, R., Penuel, W., Javitz, H., Emery, D., & Sussex, W. (2007). Effectiveness of 
reading and mathematics software products: Findings from the first student cohort. 
Washington, DC: Institute of Education Sciences. 

Easterling, B. (1982). The effects of computer assisted instruction as a supplement to classroom 
instruction in reading comprehension and arithmetic. Unpublished doctoral dissertation, 
North Texas State University. 

Erdner, R., Guy, R., & Bush, A. (1997). The impact of a year of computer assisted instruction on 
the development of first grade reading skills. Journal of Educational Computing 
Research, 18 (4), 369-388. 

Estep, S. (1997). An investigation of the relationship between integrated learning systems and 
academic achievement. Unpublished doctoral dissertation, Purdue University. 

Foorman, B. R., Francis, D. J., Fletcher, J. M., Schatschneider, C., & Mehta, P. (1998). The role 
of instruction in learning to read: Preventing reading failure in at-risk children. Journal of 
Educational Psychology, 90, 37-55. 

Frechtling, J., Zhang, X., & Silverstein, G. (2006). The Voyager Universal Literacy System: 
Results from a study of kindergarten students in inner-city schools. Journal of Education 
for Students Placed at Risk, 11(1), 75-95. 

Freiberg, H.J., Prokosch, N., Treister, E.S., & Stein, T. (1990). Turning around five at-risk 
elementary schools. Journal of School Effectiveness and School Improvement, 1(1), 5-25. 

Fuchs, D., Fuchs, L., Mathes, G., & Simmons, D. (1997). Peer-Assisted Learning Strategies: 
Making classrooms more responsive to diversity. American Educational Research 
Journal, 34 (1), 174-206. 

Fuchs, D., Fuchs, S., Thompson, A., Al-Otaiba, S., Yen, L., Yang, N., Braun, M., & O’Connor, 
R. (2001). Is reading important in reading-readiness programs? A randomized field trial 
with teachers as program implementers. Journal of Educational Psychology, 93(2), 251. 

Fuchs, L., Fuchs, D., Kazdan, S., & Allen, S. (1999). Effects of Peer-Assisted Learning Strategies 
in reading with and without training in elaborated help giving. Elementary School 
Journal, 99(3), 201-220. 

Gamse, B.C., Tepper-Jacob, R., Horst, M., Boulay, B., & Unlu, F. (2008). Reading First impact 
study: Final report. Washington, DC: Institute for Education Sciences, U.S. Department 
of Education. 



Garet, M., Cronen, S., Eaton, M., Kurki, A., Ludwig, M., Jones, W., et al. (2008). The impact of 
two professional development interventions on early reading instruction and 
achievement. New York: MDRC. 

Granick, L., & Reid, E. (1987). Writing to Read program, FY 87. Baltimore: Baltimore City 
Public Schools. 

Grant, E.M. (1973). A study of comparison of two reading programs (Ginn 360 and DISTAR) 
upon primary inner city students. Unpublished doctoral dissertation, University of 
Washington. 

Greenwood, C. R., Terry, B., Utley, C. A., Montagna, D., & Walker, D. (1993). Achievement, 
placement, and services: Middle school benefits of Classwide Peer Tutoring used at the 
elementary level. School Psychology Review, 22(3), 497-516. 

Greenwood, C.R., Delquadri, J.C., & Hall, R.V. (1989). Longitudinal effects of Classwide Peer 
Tutoring. Journal of Educational Psychology, 81 (3), 371-383. 

Hecht, S., & Close, L. (2002). Emergent literacy skills and training time uniquely predict 
variability in responses to phonemic awareness training in disadvantaged kindergartners. 
Journal of Experimental Child Psychology, 82, 93-115. 

Hecht, S. (2003). A study between Voyager and control schools in Orange County, Florida 2002- 
2003. Davie, FL: Florida Atlantic University. 

Herman, R. (1999). An educator’s guide to schoolwide reform. Arlington, VA: Educational 
Research Service. 

Hickey, K. (2006). An examination of student performance in reading/language and 
mathematics after two years of Thinking Maps® implementation in three Tennessee 
schools. Unpublished doctoral dissertation, East Tennessee State University. 

Hilger, L. (2000). Cross-age tutoring in reading: Academic and attitudinal effects from high- 
school tutors and third-grade tutees. Unpublished doctoral dissertation, University of 
Minnesota. 

Hoffman, J.T. (1984). Reading achievement and attitude toward reading of elementary students 
receiving supplementary computer assisted instruction compared with students receiving 
supplementary traditional instruction. Unpublished doctoral dissertation, Ball State 
University. 

Huxley, A. (2006). A text-based intervention of reading fluency, comprehension, and content 
knowledge. Unpublished doctoral dissertation, University of Michigan. 

Jenkins, J., Jewell, M., Leicester, N., O'Connor, R., Jenkins, L., & Troutner, N. (1994). 
Accommodations for individual differences without classroom groups: An experiment in 
school restructuring. Exceptional Children, 60(4), 344-358. 

Jones, E.M., Gottfredson, G.D., & Gottfredson, D.C. (1997). Success for some: An evaluation of 
the Success for All program. Evaluation Review, 21 (6), 643-670. 



Jones, L.R.G. (1995). The effects of an eclectic approach versus a modified whole language 
approach on the reading and writing skills of first-grade students. Unpublished doctoral 
dissertation, The University of Mississippi. 

Joshi, R.M., Dahlgren, M., & Boulware-Gooden, R. (2002). Teaching reading in an inner city 
school through a multisensory teaching approach. Annals of Dyslexia, 52(1), 229. 

Juel, C. (1988). Learning to read and write: A longitudinal study of 54 children from first 
through fourth grades. Journal of Educational Psychology, 80, 437-447. 

Kadel Research Consulting. (2006). Garfield Heights City Schools Maple Leaf Intermediate, 
Enhancing Education Through Technology, End-year evaluation. Hyde Park, VT: Kadel 
Research Consulting. 

Kennedy, M. M. (1978). Findings from the Follow Through Planned Variation study. 
Educational Researcher, 7(6), 3-11. 

Knox, M. (1996). An experimental study of the effects of The Accelerated Reader program and a 
teacher directed program on reading comprehension and vocabulary of fourth and fifth 
grade students. Dissertation Abstracts International, 57(10), 4208A. (UMI No. 
9710798). 

Kuhn, M.R., Schwanenflugel, P.J., Morris, R.D., Morrow, L.M., Woo, D.G., Meisinger, E.B., 
Sevick, R.A., Bradley, B.A., & Stahl, S.A. (2006). Teaching children to be fluent and 
automatic readers. Journal of Literacy Research, 38 (4), 357-387. 

Kulik, J. A. (2003). Effects of using instructional technology in elementary and secondary 
schools: What controlled evaluation studies say. SRI Project Number P10446.001. 
Arlington, VA: SRI International. 

Leary, S.F. (1999). The effect of Thinking Maps® instruction on the achievement of fourth-grade 
students. Unpublished doctoral dissertation, Virginia Polytechnic Institute and State 
University. 

Levy, M.H. (1985). An evaluation of computer assisted instruction upon the achievement of fifth 
grade students as measured by standardized tests. Unpublished doctoral dissertation, 
University of Bridgeport. 

Lie, A. (1991). Effects of a training program for stimulating skills in word analysis in first-grade 
children. Reading Research Quarterly, 26, 234-250. 

Lindsey, M.M. (1998). A comprehensive evaluation of an integrated reading and language arts 
curriculum, with attention to the experiences of low achieving children. Unpublished 
doctoral dissertation, University of Oregon. 

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage. 

Livingston, M. & Flaherty, J. (1997). Effects of Success for All on reading achievement in 
California schools. Los Alamitos, CA: WestEd. 

Lundberg, I., Frost, J., & Petersen, O. (1988). Effects of an extensive program for stimulating 
phonological awareness in preschool children. Reading Research Quarterly, 23, 263-284. 



Mac Iver, M., Kemper, E., & Stringfield, S. (2003). The Baltimore Curriculum Project: Final 
report of the four-year evaluation study. Baltimore, MD: Johns Hopkins University, 
Center for Social Organization of Schools. 

Macaruso, P., Hook, P.E., & McCabe, R. (2006). The efficacy of computer-based supplementary 
phonics programs for advancing reading skills in at-risk elementary students. Journal of 
Research in Reading, 29, 162-172. 

Madden, N.A., Slavin, R.E., Karweit, N.L., Dolan, L.J., & Wasik, B.A. (1993). Success for All: 
Longitudinal effects of a restructuring program for inner-city elementary schools. 
American Educational Research Journal, 30, 123-148. 

Marion, G.G. (2004). An examination of the relationship between students’ use of the Fast 
ForWord Reading Program and their performance on standardized assessments in 
elementary schools. Unpublished doctoral dissertation, East Tennessee State University. 

Mathes, P., & Babyak, A. (2001). The effects of Peer-Assisted Literacy Strategies for first-grade 
readers with and without additional mini-skills lessons. Learning Disabilities Research & 
Practice, 16(1), 28-44. 

Mathes, P., Howard, J., Allen, S., & Fuchs, D. (1998). Peer-assisted Learning Strategies for First- 
grade Readers: Responding to the Needs of Diverse Learners. Reading Research 
Quarterly, 33, 62-94. 

Mathes, P., Torgesen, J., & Allor, J. (2001). The effects of Peer-Assisted Literacy Strategies for 
first-grade readers with and without additional computer-assisted instruction in 
phonological awareness. American Educational Research Journal, 38(2), 371-410. 

Mathes, P., Torgesen, J., Clancy-Menchetti, J., Santi, K., Nicholas, K., Robinson, C., et al. 
(2003). A comparison of teacher-directed versus peer-assisted instruction to struggling 
first-grade readers. The Elementary School Journal, 103(5), 461-479. 

Merzenich, M., Jenkins, W., Johnston, P., Schreiner, C., Miller, S., & Tallal, P. (1996). Temporal 
processing deficits of language learning impaired children ameliorated by training. 
Science, 271, 77-80. 

Miller, H. (1997). Quantitative analyses of student outcome measures. International Journal of 
Educational Research, 25, 119-136. 

Morrow, L.M. (1992). The impact of a literature-based program on literacy achievement, use of 
literature, & attitudes of children from minority backgrounds. Reading Research 
Quarterly, 27, 250-275. 

Moss, M., Fountain, A.R., Boulay, B., Horst, M., Rodger, C., & Brown-Lyons, M. (2008). 
Reading First implementation evaluation: Final report. Cambridge, MA: Abt Associates. 

Munoz, M.A. & Dossett, D. (2004). Educating students placed at risk: Evaluating the impact of 
Success for All in urban settings. Journal of Education for Students Placed at Risk, 9(3), 
261-277. 



Murphy, R., Penuel, W., Means, B., Korbak, C., Whaley, A., & Allen, J. (2002). E-DESK: A 
review of recent evidence on discrete educational software. Menlo Park, CA: SRI 
International. 

Mys, D.P., & Petrie, J. (1988). Evaluation of student reading and math WICAT computer 
managed instructional program Salina Elementary School November, 1985 - June 1988. 
Bulletin No. 1345, Office of Research and Evaluation, Public Schools, Dearborn, MI. 

National Center for Education Statistics. (2007). The National Assessment of Educational 
Progress. Washington, DC: US Department of Education. 

National Early Literacy Panel (2008). Developing early literacy. Washington, DC: National 
Institute for Literacy. 

National Reading Panel (2000). Teaching children to read: An evidence-based assessment of the 
scientific research literature on reading and its implications for reading instruction. 
Rockville, MD: National Institute of Child Health and Human Development. 

Nelson, J. R., & Stage, S. A. (2007). Fostering the development of vocabulary knowledge and 
reading comprehension through contextually based supplemental multiple meaning 
vocabulary instruction. Education and Treatment of Children, 30(1), 1-22. 

Nunnery, J., Slavin, R.E., Ross, S.M., Smith, L.J., Hunter, P., & Stubbs, J. (1996, April). An 
assessment of Success for All program component configuration effects on the reading 
achievement of at-risk first grade students. Paper presented at the annual meeting of the 
American Educational Research Association, New York, NY. 

O’Connor, R. (1999). Teachers learning Ladders to Literacy. Learning Disabilities Research & 
Practice, 14(4), 203-214. 

Oglesby, F., & Suter, W. N. (1995). Matching reading styles and reading instruction. Research in 
the Schools (Mid-South Educational Research Association), 2(1), 11-15. 

Opuni, K.A. (2006). The effectiveness of the Consistency Management & Cooperative Discipline 
(CMCD) model as a student empowerment and achievement enhancer: The experiences 
of two K-12 inner-city school systems. Paper presented at the 4th Annual Hawaii 
International Conference of Education, Honolulu, Hawaii. 

Papalewis, R. (2004). Struggling middle school readers: Successful, accelerating intervention. 
Sacramento, CA: California State University. 

Paterson, W., Henry, J., O’Quin, K., Ceprano, M., & Blue, E. (2003). Investigating the 
effectiveness of an integrated learning system on early emergent readers. Reading 
Research Quarterly, 38(2), 172-206. 

Phillips, L., Norris, S., Mason, J. & Kerr, B. (1990). Effect of early literacy intervention on 
kindergarten achievement (Tech. Rep. No. 520). Champaign: University of Illinois at 
Urbana-Champaign, Center for the Study of Reading. 

Policy Studies Associates (2007). Evidence of long-term learning outcomes among Reading 
Together tutees. Washington, DC: Author. 



Pressley, M. (1998). Reading instruction that works: The case for balanced teaching. New York: 
Guilford Press. 

Ragosta, M. (1983). Computer-assisted instruction and compensatory education: A longitudinal 
analysis. Machine-Mediated Learning, 1(1), 97-127. 

Rapp, J. C. (1991). The effect of cooperative learning on selected student variables (Cooperative 
Integrated Reading and Composition) on academic achievement in reading 
comprehension, vocabulary and spelling and on student self-esteem. Ed.D. dissertation, 
Washington State University, Washington. Retrieved September 5, 2007, from
ProQuest Digital Dissertations database. (Publication No. AAT 9207225). 

Reid, E. (1996). Exemplary Center for Reading Instruction (ECRI) validation study. Salt Lake 
City, UT: Exemplary Center for Reading. (ERIC No. ED 414560). 

Reis, S.M., Eckert, R.D., McCoach, D.B., Jacobs, J.K., & Coyne, M. (2008). Using enrichment 
reading practices to increase reading fluency, comprehension, and attitudes. Journal of 
Educational Research, 101 (5), 299-314. 

Renzulli, J.S., & Reis, S.M. (1998). The schoolwide enrichment model: A how-to guide for 
educational excellence (2nd ed.). Mansfield Center, CT: Creative Learning Press.

Resendez, M., Sridharan, S., & Azin, M. (2006). Harcourt Achieve's Elements of Reading:
Comprehension randomized control trial. Jackson, WY: PRES Associates. 

Rimm-Kaufman, S., Fan, X., Chiu, Y., & You, W. (2007). The contribution of the Responsive 
Classroom Approach on children's academic achievement: Results from a three year 
longitudinal study. Journal of School Psychology, 45, 401-421. 

RMC Research Corporation (2003). The Literacy Center K-1 Las Vegas research project. Las
Vegas, NV: Author. 

Rohrbeck, C. A., Ginsburg-Block, M. D., Fantuzzo, J. W., & Miller, T. R. (2003). Peer-assisted 
learning interventions with elementary school students: A meta-analytic review. Journal 
of Educational Psychology, 94 (2), 240-257. 

Rose, J. (2006). Independent review of the teaching of early reading. London: Department for 
Education and Skills. 

Ross, S.M., & Casey, J. (1998a). Longitudinal study of student literacy achievement in different 
Title I school-wide programs in Ft. Wayne community schools, year 2: First grade 
results. Memphis: University of Memphis, Center for Research in Educational Policy. 

Ross, S.M., & Casey, J. (1998b). Success for All evaluation, 1997-98 Tigard-Tualatin School 
District. Memphis: University of Memphis, Center for Research on Educational Policy. 

Ross, S.M., Nunnery, J.A., & Smith, L.J. (1996). Evaluation of Title I reading programs:
Amphitheater Public Schools Year 1: 1995-1996. Memphis, TN: University of Memphis,
Center for Research in Educational Policy. 

Ross, S.M., Smith, L., Bond, C., & Casey, J. (1994). Final report: 1993-1994 Success for All
program in Montgomery, Alabama. Memphis, TN: Center for Research in Education 
Policy, University of Memphis. 



Ross, S.M., Smith, L., & Casey, J. (1997). Final report: 1996-97 Success for All program in
Clarke County, Georgia. Memphis, TN: University of Memphis, Center for Research on
Educational Policy. 

Ross, S.M., Smith, L., & Casey, J. (1997b). Preventing early school failure: Impacts of Success
for All on standardized test outcomes, minority group performance, and school
effectiveness. Journal of Education for Students Placed at Risk, 2 (1), 29-53. 

Ross, S.M., Smith, L.J., & Casey, J. (1992). Final report: 1991-92 Success for All program in 
Caldwell, Idaho. Memphis, TN: Memphis State University. 

Ross, S.M., Smith, L.J., & Casey, J. (1995). Final Report: 1994-95 Success for All program in 
Fort Wayne, Indiana. Memphis: University of Memphis, Center for Research in 
Educational Policy. 

Ross, S.M., Smith, L.J., & Casey, J.P. (1997). Preventing early school failure: Impacts of
Success for All on standardized test outcomes, minority group performance, and school
effectiveness. Journal of Education for Students Placed at Risk, 2 (1), 29-53.

Roth, S. & Beck, I. (1987). Theoretical and instructional implications of the assessment of two
microcomputer word recognition programs. Reading Research Quarterly, 22(2), 197-218.

Rothstein, H.R., Sutton, A.J., & Borenstein, M. (Eds.) (2005). Publication bias in meta-analysis: 
Prevention, assessment and adjustments. Chichester, UK: John Wiley.

Saracho, O. (1982). The effects of a computer-assisted computer program on basic skills
achievement and attitudes toward instruction of Spanish-speaking migrant children.
American Educational Research Journal, 19(2), 201-219. 

Scarcelli, S., & Morgan, R. (1999). The efficacy of using a direct reading instruction approach in 
literature based classrooms. Reading Improvement, 36 (4), 172-179. 

Schmidt, S. (1991). Technology for the 21st century: The effects of an integrated distributive 
computer network system on student achievement. Unpublished doctoral dissertation, 
University of La Verne, La Verne, CA.

Schneider, W., Kuspert, P., Roth, E., Vise, M., & Marx, H. (1997). Short- and long-term effects
of training phonological awareness in kindergarten: Evidence from two German studies. 
Journal of Experimental Child Psychology, 66, 311-340. 

Schultz, L. (1996). Effectiveness study of Scholastic Phonics Readers and a comprehensive 
reading program. New York: Scholastic. 

Scientific Learning Corporation. (2006). Improved reading skills by students in Boone County 
School District who used Fast ForWord products. Maps for Learning: Educator Reports, 
10 (15), 1-7. 

Sedlmeier, P., & Gigerenzer, G. (1989). Do studies of statistical power have an effect on the 
power of studies? Psychological Bulletin, 105, 309-316. 

Shadish, W.R., Cook, T.D., & Campbell, D.T. (2002). Experimental and quasi-experimental 
designs for generalized causal inference. Boston: Houghton-Mifflin. 



Shapiro, L.R. & Solity, J. (2008). Delivering phonological and phonics training within whole 
class teaching. The British Journal of Educational Psychology, 78 (4), 597-620. 

Sivin-Kachala, J., & Bialo, E. (2005). Fluency Formula second grade study. Long Island, NY, 
New York: Scholastic.

Skeans, S. (1991). The effects of Cooperative Integrated Reading and Composition, fidelity of 
implementation, and teacher concerns on student achievement. Ph.D. dissertation, Texas 
A&M University. Retrieved September 5, 2007, from ProQuest Digital Dissertations 
database. (Publication No. AAT 9217026). 

Skindrud, K., & Gersten, R. (2006). An evaluation of two contrasting approaches for improving 
reading achievement in a large urban district. The Elementary School Journal, 106 (5), 
389-407. 

Slavin, R. E. (1986). Best-evidence synthesis: An alternative to meta-analytic and traditional 
reviews. Educational Researcher, 15 (9), 5-11.

Slavin, R.E. (1995). Cooperative learning: Theory, research, and practice (2nd Ed.). Boston:
Allyn & Bacon. 

Slavin, R. (2008). What works? Issues in synthesizing education program evaluations.
Educational Researcher, 37 (1), 5-14.

Slavin, R.E. (2009). Cooperative learning. In G. McCulloch & D. Crook (Eds.) International 
Encyclopedia of Education. Abingdon, UK: Routledge.

Slavin, R.E., Cheung, A., Groff, C., & Lake, C. (2008). Effective reading programs for middle 
and high schools: A best evidence synthesis. Reading Research Quarterly, 43 (3), 290-322.

Slavin, R.E., & Lake, C. (2008). Effective programs in elementary mathematics: A best evidence 
synthesis. Review of Educational Research, 78 (3), 427-515. 

Slavin, R. E., Lake, C., Davis, S. & Madden, N. A. (2009, April). Effective programs for
struggling readers: A best-evidence synthesis. Paper presented at the annual meetings of
the American Educational Research Association, San Diego, CA. 

Slavin, R.E., Lake, C., & Groff, C. (in press). Effective programs in middle and high school 
math: A best evidence synthesis. Review of Educational Research. 

Slavin, R.E., & Madden, N.A. (1991). Success for All at Buckingham Elementary: Second year 
evaluation. Baltimore, MD: Johns Hopkins University, Center for Research on Effective 
Schooling for Disadvantaged Students. 

Slavin, R.E., & Madden, N. (1998). Success for All/Exito Para Todos: Effects on the reading 
achievement of students acquiring English. Report No. 19. Baltimore, MD: Center for 
Research on the Education of Students Placed at Risk. 

Slavin, R.E., & Madden, N. A. (in press). Measures inherent to treatments in systematic reviews 
in education. Journal of Research on Educational Effectiveness. 



Slavin, R. E., Madden, N. A., Chambers, B. & Haxby, B. (2009). Two million children: Success 
for All. Thousand Oaks, CA: Corwin. 

Slavin, R.E., Madden, N.A., Dolan, L.J., & Wasik, B.A. (1993). Success for All in the Baltimore 
City Public Schools: Year 6 report. Baltimore, MD: Johns Hopkins University, Center for 
Research on Effective Schooling for Disadvantaged Students. 

Slavin, R.E., & Smith, D. (in press). Effects of sample size on effect size in systematic reviews in 
education. Educational Evaluation and Policy Analysis. 

Smith, L., & Ross, S. (1992). 1991-1992 Ft Wayne, IN SFA results. Memphis, TN: Memphis 
State University, Center for Research in Educational Policy. 

Smith, L.J., Ross, S.M., & Casey, J.P. (1994). Special education analyses for Success for All in 
four cities. Memphis: University of Memphis, Center for Research in Educational 
Policy.

Snow, C.E., Burns, S.M., & Griffin, P. (Eds.) (1998). Preventing reading difficulties in young 
children. Washington, DC: National Academy Press. 

Snow, M.F. (1993). The effects of computer-assisted instruction and focused tutorial services on
the achievement of marginal learners. Ed.D. dissertation, University of Miami.
Retrieved September 5, 2007, from ProQuest Digital Dissertations database. (Publication
No. AAT 9401831). 

Social Programs that Work (2008). Success for All. Retrieved 12/12/08 from 
www.evidencebasedprograms.org.

Sporer, N., Brunstein, J., & Kieschke, U. (2009). Improving students’ reading comprehension 
skills: Effects of strategy instruction and reciprocal teaching. Learning and Instruction,
19, 272-286.

Stambaugh, T. (2007). Effects of the Jacob's Ladder reading comprehension program on 
reading comprehension and critical thinking skills of third, fourth, and fifth grade 
students in rural, Title I schools. Unpublished doctoral dissertation, The College of 
William and Mary. 

Standish, D. (1995). The effects on reading comprehension of Jostens' Integrated Language Arts 
for second-grade students along with Jostens' Basic Learning System for second-grade 
Chapter 1 students. Dissertation Abstracts International, 57 (3), 1079A. (UMI No.
9623238). 

Stebbins, L.B., St. Pierre, R.G., Proper, E.C., Anderson, R.B., & Cerva, T.R. (1976). Education 
as experimentation: A planned variation model, Volumes IIIA and IIIB. Cambridge, MA: 
Abt Associates. (ERIC No. ED 148489). 

Stein, M., Berends, M., Fuchs, D., McMaster, K., Saenz, L., Yen, L., Fuchs, L., & Compton, D.
(2008). Scaling up an early reading program: Relationships among teacher support, 
fidelity of implementation, and student performance across different sites and years. 
Educational Evaluation and Policy Analysis, 30 (4), 368-388. 



Stevens, R.J., & Slavin, R.E. (1995a). Effects of a cooperative approach in reading and writing 
on academically handicapped and nonhandicapped students. The Elementary School 
Journal, 95 (3), 241-262. 

Stevens, R.J. & Slavin, R.E. (1995b). The cooperative elementary school: Effect on student
achievement and social relations. American Educational Research Journal, 32, 321-351.

Stevens, R.J., Madden, N.A., Slavin, R.E., & Farnish, A.M. (1987). Cooperative Integrated
Reading and Composition: Two field experiments. Reading Research Quarterly, 22, 433-454.

Stevens, R. J., Van Meter, P., Garner, J., Warcholak, N., Bochna, C., & Hall, T. (2008). The
Reading and Integrated Literacy Strategies (RAILS): An integrated approach to early 
reading. Journal of Education for Students Placed at Risk, 13 (4), 357-380. 

Stevenson, Z., Cathey-Pugh, J., & Kosmidis, M. (1988). Achievement in the Writing to Read
program: A comparative evaluation study. Washington, DC: District of Columbia Public
Schools, Division of Quality Assurance and Management Planning. (ERIC No. 
ED293147) 

Swartz, J., & Johnston, K. (2003). Efficacy study of Houghton Mifflin Reading: A legacy of 
literacy. Cambridge, MA: Abt Associates. 

Tallal, P., Miller, S., Bedi, G., Byma, G., Wang, X., Nagarajan, S., Schreiner, C., Jenkins, W.,
& Merzenich, M. (1996). Language comprehension in language-learning impaired children
improved with acoustically modified speech. Science, 271, 81-84. 

Texas Center for Educational Research (2007). Evaluation of the Texas Technology Immersion 
Pilot: Findings from the second year. Austin, TX: Author. 

Torgerson, C. (2006). The quality of systematic reviews of effectiveness in literacy learning in 
English: A tertiary review. Journal of Research in Reading, 29 (2), 1-29.

Torgerson, C. J., Brooks, G., & Hall, J. (2006). A systematic review of the research literature on 
the use of phonics in the teaching of reading and spelling (DfES Research Rep. 711). 
Department for Education and Skills, London. 

Tracey, D. & Young, J. (2006). Technology and early literacy: The impact of an integrated
learning system on high-risk kindergartners' achievement. Union, NJ: Kean University.

Van Keer, H. & Verhaeghe, J. (2005). Comparing two teacher development programs for
innovating reading comprehension instruction with regard to teachers’ experiences and
student outcomes. Teaching and Teacher Education, 21, 543-562. 

Van Keer, H. & Verhaeghe, J. (2008). Strategic reading in peer tutoring dyads in second- and 
fifth-grade classrooms. Unpublished report. Ghent University, Belgium. 

Vaughan, J., Serido, J., & Wilhelm, M. (2006). The effects of My Reading Coach on reading
achievement of elementary education students. Phoenix, AZ: Arizona Board of Regents.

Wang, L.W. & Ross, S.M. (1999a). Evaluation of Success for All program. Little Rock School 
District, year 2: 1998-99. Memphis: University of Memphis, Center for Research in 
Educational Policy. 



Wang, L.W. & Ross, S.M. (1999b). Results for Success for All Program, Alhambra (AZ) School 
District. Memphis: University of Memphis, Center for Research in Educational Policy. 

Wasik, B., & Slavin, R. (1993). Success for All at Pepper hill Elementary School: 1993 
evaluation. Baltimore, MD: Center for Research on Effective Schooling for 
Disadvantaged Students. Johns Hopkins University. 

Webb, N. M. (2008). Learning in small groups. In T. L. Good (Ed.), 21st Century Education: A
Reference Handbook (pp. 203-211). Los Angeles: Sage. 

What Works Clearinghouse (2009). Beginning reading topic report. Washington, DC: U.S.
Department of Education. Retrieved March 15, 2009, from http://ies.ed.gov/ncee/wwc.

Whitaker, J.C. (2005). Impact of an integrated learning system on reading and mathematics 
achievement. Unpublished doctoral dissertation, Tennessee State University. 

White, R. N., Haslam, M. B., & Hewes, G. M. (2006). Improving student literacy in the Phoenix
Union High School District, 2003-04 and 2004-05. Final Report. Washington, DC: 
Policy Studies Associates. 

Wilkerson, S.B. (2004). A study of the effectiveness of Harcourt Achieve’s Rigby Literacy
Program: Final evaluation report. Aurora, CO: Mid-Continent Research for Education
and Learning. 

Wilkerson, S.B., Shannon, L.C., & Herman, T.L. (2006). An efficacy study on Scott Foresman's 
Reading Street Program: Year one report. Austin, Texas: Magnolia Consulting. 

Wilkerson, S.B., Shannon, L.C., & Herman, T.L. (2007). An efficacy study on Scott Foresman's 
Reading Street Program: Year two report. Magnolia Consulting. 

Williams, B.J. (2005). A quasi-experimental study on the effects of the OpenBook to Literacy 
program on fourth-grade students. Unpublished doctoral dissertation, Tennessee State 
University. 

Yee, V.N. (2007). An evaluation of the impact of a standards-based intervention on the
academic achievement of English language learners. Unpublished doctoral dissertation,
University of Southern California. 



Table 1: Beginning Reading Curricula

| Study | Design (Large/Small) | Duration | N | Grade | Sample Characteristics | Posttest | Effect Sizes by Subgroup/Measure | Decoding | Comprehension | Overall Effect Size |
|---|---|---|---|---|---|---|---|---|---|---|
| Core Basal Programs | | | | | | | | | | |
| Open Court Reading | | | | | | | | | | |
| Borman, Dowling & Schneck (2007) | Randomized Quasi-Experiment (L) | 1 year | lScb3ses (9E. X); 307 students (168E, 139C) | 1 | Schools in Idaho, Texas, Florida, and Indiana; 61% FL, 57% minority | Terra Nova: Reading Comprehension | -0.06 | | -0.06 | -0.17 |
| | | | | | | Terra Nova: Reading Vocabulary | -0.22 | | | |
| | | | | | | Terra Nova: Reading Composite | -0.17 | | | |
| Reading Street | | | | | | | | | | |
| Wilkerson, Shannon, & Herman (2007) | Randomized Quasi-Experiment (L) | 1 year | 18 teachers; 387 students (220E, 167C) | 1 | Schools in 4 sites around the U.S.; 26% FL, 86% W, 8% H, 3% AA | Gates MacGinitie | | - | -0.15 | -0.15 |
| Wilkerson, Shannon, & Herman (2006) | Randomized Quasi-Experiment (L) | 1 year | 16 teachers (8E, 8C) | 1 | 5 schools in 2 urban, 1 rural site; 54% FL, 57% W, 25% AA, 11% H | Gates MacGinitie | | - | -0.02 | -0.02 |
| Scholastic Phonics Readers and Literacy Place | | | | | | | | | | |
| Schultz (1996) | Quasi-Experiment (L) | 1 year | 4 districts; 8 classes; 301 students (162E, 139C) | 1 | Large urban school districts in CA | CTBS: Reading | -0.07 | -0.23 | -0.14 | -0.16 |
| | | | | | | CTBS: Vocabulary | -0.11 | | | |
| | | | | | | CTBS: Comprehension | -0.21 | | | |
| | | | | | | CTBS: Word Analysis | -0.23 | | | |
| Supplemental Curricula | | | | | | | | | | |
| Open Court Phonics Kit | | | | | | | | | | |
| Barrett (1995) | Matched (S) | 1 year | 9 classes (5E, 4C); 161 students (78E, 83C) | 1 | Middle class district in Riverside, CA | TERA-2 | -0.36 | -0.54 | -0.47 | -0.49 |
| | | | | | | SAT Total | -0.62 | | | |
| Phonics in Context | | | | | | | | | | |
| Barrett (1995) | Matched (S) | 1 year | 11 classes (7E, 4C); 170 students (87E, 83C) | 1 | Middle class district in Riverside, CA | TERA-2 | -0.21 | -0.43 | -0.40 | -0.34 |
| | | | | | | SAT Total | -0.47 | | | |
| Elements of Reading: Phonics and Phonemic Awareness | | | | | | | | | | |
| Apthorp (2005) | Randomized Quasi-Experiment (L) | 1 year | 6 schools; 16 teachers (8E, 8C); 257 students (126E, 131C) | 1 | 4 high-poverty, 2 middle class schools; overall, 51% FL, 56% AA, 41% W, 5% H | ERDA | -0.09 | -0.09 | -0.29 | -0.19 |
| | | | | | | Gates MacGinitie | -0.39 | | | |

Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; CTBS=Comprehensive Test of Basic Skills; SAT=Scholastic Achievement Test; TERA=Test of Early Reading Ability; ERDA=Early Reading Diagnostic Assessment; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic.






Table 2: Instructional Technology in Beginning Reading

| Study | Design (Large/Small) | Duration | N | Grade | Sample Characteristics | Posttest | Effect Sizes by Subgroup/Measure | Decoding | Comprehension | Overall Effect Size |
|---|---|---|---|---|---|---|---|---|---|---|
| Computer-Assisted Instruction | | | | | | | | | | |
| Destination Reading | | | | | | | | | | |
| Campuzano et al. (2009) | Randomized (L) | 1 year | 21 teachers (2 IE. 140; 742 students (448E, 294C) | 1 | Schools across the U.S.; 71% FL, 31% AA, 34% H, 34% W | SAT-10 | | - | -0.11 | -0.11 |
| Headsprout | | | | | | | | | | |
| Campuzano et al. (2009) | Randomized (L) | 1 year | 63 teachers (32E, 31C); 1,079 students (574E, 505C) | 1 | Schools across the U.S.; 35% FL, 81% W, 13% AA, 67% H | SAT-10 | | - | -0.01 | -0.01 |
| Plato Focus | | | | | | | | | | |
| Campuzano et al. (2009) | Randomized (L) | 1 year | 29 teachers (15E, 14C); 618 students (327E, 291C) | 1 | Schools across the U.S.; 48% FL, 67% W, 27% H, 5% AA | SAT-10 | | - | -0.03 | -0.03 |
| Waterford Early Reading Program | | | | | | | | | | |
| Campuzano et al. (2009) | Randomized (L) | 1 year | 46 teachers (28E, 18C); 1,155 students (689E, 466C) | 1 | Schools across the U.S.; 47% FL, 37% AA, 16% H | SAT-10 | | - | -0.02 | -0.02 |
| Cassidy & Smith (2005) | Matched (S) | 1 year | 6 classes (3E, 3C); 93 students (46E, 47C) | 1 | School in rural Midwest | Terra Nova Reading | | - | -0.71 | -0.71 |
| Phonics Based Reading | | | | | | | | | | |
| Macaruso, Hook, & McCabe (2006) | Matched (S) | 7 mo. | 5 schools; 10 classes (5E, 5C); 179 students (92E, 87C) | 1 | Boston area; 50% FL | Gates MacGinitie | | - | -0.20 | -0.20 |
| The Literacy Center (LeapFrog) | | | | | | | | | | |
| RMC (2004) | Randomized Quasi-Experiment (S) | 1 year | 6 schools; 195 students (109E, 86C) | 1 | High-poverty schools in Las Vegas; 30% ELL | Gates MacGinitie | -0.04 | -0.01 | -0.04 | -0.02 |
| | | | | | | DIBELS | -0.01 | | | |






| Study | Design (Large/Small) | Duration | N | Grade | Sample Characteristics | Posttest | Effect Sizes by Subgroup/Measure | Decoding | Comprehension | Overall Effect Size |
|---|---|---|---|---|---|---|---|---|---|---|
| WICAT | | | | | | | | | | |
| Erdner, Guy, & Bush (1997) | Matched (S) | 1 year | 2 schools; 85 students | 1 | Schools in north central OK | CTBS | | - | -1.05 | -1.05 |
| Reading Machine | | | | | | | | | | |
| Abram (1984) | Randomized (S) | 12 weeks | 103 students | 1 | Not stated | ITBS | | - | -0.19 | -0.19 |
| Mixed-Method Models | | | | | | | | | | |
| Writing to Read | | | | | | | | | | |
| Collis, Ollila, & Ollila (1990) | Matched (S) | 1 year | 97 students (53E, 44C) | 1 | Schools in British Columbia, Canada | SAT: Total Reading | -0.47 | - | -0.47 | -0.27 |
| | | | | | | SAT: Word Study | -0.07 | | | |
| Beasley (1989) | Matched (S) | 6 months | (42E, 32C) | 1 | Middle-class students in Athens, AL; 82% W, 18% AA | SESAT-2: Sounds & Letters | -0.09 | -0.13 | -0.52 | -0.27 |
| | | | | | | SESAT-2: Word Reading | -0.15 | | | |
| | | | | | | SESAT-2: Sentence Reading | -0.44 | | | |
| | | | | | | SESAT-2: Reading Comprehension | -0.52 | | | |
| | | | | | | SESAT-2: Total Reading | -0.44 | | | |
| Embedded Multimedia | | | | | | | | | | |
| Reading Reels | | | | | | | | | | |
| B. Chambers et al. (2006) | Randomized (L) | 1 year | 10 schools; 394 students | 1 | High-poverty schools in Hartford, CT; 61% H, 35% AA | Woodcock: Word ID | -0.15 | -0.20 | -0.08 | -0.17 |
| | | | | | | Woodcock: Word Attack | -0.32 | | | |
| | | | | | | Woodcock: Passage Comp. | -0.08 | | | |
| | | | | | | DIBELS | -0.12 | | | |
| B. Chambers et al. (2008) | Randomized (S) | 1 year | 2 schools; 159 students (75E, 84C) | 1 | Hispanic students in high-poverty schools in Los Angeles and Las Vegas | Woodcock: Letter-Word | -0.33 | -0.30 | -0.17 | |
| | | | | | | Woodcock: Word Attack | -0.28 | | | |
| | | | | | | GORT: Fluency | -0.28 | | | |
| | | | | | | GORT: Comprehension | -0.17 | | | |

Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; SAT-9=Stanford Achievement Test 9th Edition; TOWRE=Test of Word Reading Efficiency; CTBS=Comprehensive Test of Basic Skills; ITBS=Iowa Test of Basic Skills; SAT=Scholastic Achievement Test; SESAT=Stanford Early School Achievement Test; GORT=Gray Oral Reading Test; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic; ELL=English language learner.






Table 3: Instructional Process Programs in Beginning Reading

| Study | Design (Large/Small) | Duration | N | Grade | Sample Characteristics | Posttest | Effect Size by Subgroups/Measure | Decoding | Comprehension | Overall Effect Size |
|---|---|---|---|---|---|---|---|---|---|---|
| Cooperative Learning Programs | | | | | | | | | | |
| Classwide Peer Tutoring (CWPT) | | | | | | | | | | |
| Greenwood et al. (1989) | Randomized Quasi-Experiment (S) | 4 years | 6 schools (3E, 3C); 123 students | 1-4 (same students) | High-poverty schools in Kansas City, KS | MAT: Grade 4 | -0.57 | - | -0.57 | -0.57 |
| | | | | | | MAT: Grade 6 (2 year followup) | -0.55 | | | |
| PALS | | | | | | | | | | |
| Mathes & Babyak (2001) | Randomized Quasi-Experiment (S) | 14 weeks | 20 classes (10E, 10C); (61E, 49C) | 1 | Schools in Florida; 63% W, 36% AA | Woodcock: Word Identification | -0.51 | -0.72 | -0.41 | -0.61 |
| | | | | | | Woodcock: Word Attack | -0.92 | | | |
| | | | | | | Woodcock: Passage Comprehension | -0.41 | | | |
| Calhoon et al. (2006) | Randomized Quasi-Experiment (S) | 20 weeks | 3 schools; 6 classrooms; 78 students (41E, 37C) | 1 | Students taught in English in a majority-Hispanic school in NM; *5% FL, 32% W, 68% H | DIBELS: Nonsense Word Fluency | -0.58 | -0.29 | - | -0.29 |
| | | | | | | DIBELS: Oral Reading Fluency | -0.00 | | | |
| Calhoon et al. (2007) | Randomized Quasi-Experiment (S) | 16 weeks | 3 schools; 6 classrooms; 76 students (43E, 33C) | 1 | Students in border schools in 2-way bilingual program; : : : FL, 79% H, 21% W, 28% ELL | DIBELS: Nonsense Word Fluency | -0.51 | -0.33 | - | -0.33 |
| | | | | | | DIBELS: Letter Naming Fluency | -0.20 | | | |
| | | | | | | DIBELS: Oral Reading Fluency | -0.29 | | | |
| (2001) | Matched (S) | 16 weeks | 24 classes (12E, 12C); 140 students (84E, 56C) | 1 | Schools in the southeast; 65% W, 32% AA | Woodcock: Word Identification | -0.39 | -0.49 | -0.56 | -0.50 |
| | | | | | | Woodcock: Word Attack | -0.59 | | | |
| | | | | | | Woodcock: Passage Comprehension | -0.56 | | | |
| | | | | | | TERA-2 | -0.48 | | | |
| Mathes et al. (1998) | Matched (S) | 16 weeks | 20 classes (10E, 10C); 96 students (48E, 48C) | 1 | Schools in southeastern city | Woodcock: Word Identification | -0.21 | -0.38 | -0.37 | -0.37 |
| | | | | | | Woodcock: Word Attack | -0.54 | | | |
| | | | | | | Woodcock: Passage Comprehension | -0.37 | | | |
| Phonological Awareness Training Programs | | | | | | | | | | |
| Lie (1991) | Randomized Quasi-Experiment (S) | | 10 schools; 208 students (Sequential analysis: 52 students; Positional analysis: 60 students; Control: 96 students) | 1-2 | Schools in Halden, Norway | Norwegian Reading Test: End of grade 1 | -0.34 | | -0.30 | -0.30 |
| | | | | | | Norwegian Reading Test: End of grade 2 | -0.30 | | | |






| Study | Design (Large/Small) | Duration | N | Grade | Sample Characteristics | Posttest | Effect Size by Subgroups/Measure | Decoding | Comprehension | Overall Effect Size |
|---|---|---|---|---|---|---|---|---|---|---|
| Lundberg, Frost, & Petersen (1988) | Matched (L) | 3 years | 390 students (235E, 155C) | K-2 | Schools in rural Denmark | End of grade 1 | -0.40 | - | -0.48 | -0.48 |
| | | | | | | End of grade 2 | -0.48 | | | |
| Schneider, Kuspert, Roth, Vise, & Marx (1997) (Study 1) | Matched (L) | 3 years | 23 classes (he. no; 371 students (205E, 166C) | K-2 | Schools in rural Germany | German Reading Test: End of grade 1 | -0.29 | | -0.19 | -0.19 |
| | | | | | | German Reading Test: End of grade 2 | -0.19 | | | |
| Schneider, Kuspert, Roth, Vise, & Marx (1997) (Study 2) | Matched (L) | 3 years | 18 classes (11E, 7C); 345 students (191E, 155C) | K-2 | Schools in rural Germany | German Reading Test: End of grade 1 | -0.53 | -- | -0.33 | -0.33 |
| | | | | | | German Reading Test: End of grade 2 | -0.33 | | | |
| Blachman et al. (1999) | Matched (S) | 1 1/2 years: 11 weeks in K-1, 1 year in 1st grade | 4 schools (2E, 2C); 128 students (66E, 62C); one year follow-up: 106 students (58E, 48C) | K-1; 1 year follow-up | High-poverty schools in Syracuse, NY | Woodcock Word ID | -0.28 | -0.33 | - | -0.33 |
| | | | | | | Decoding of Real Words | -0.64 | | | |
| | | | | | | Decoding of Non-Words | -0'4 | | | |
| | | | | | | Woodcock Word ID (one year follow-up) | -0.31 | | | |
| | | | | | | Decoding of Real Words (one year follow-up) | -0.34 | | | |
| | | | | | | Decoding of Non-Words (one year follow-up) | -0.36 | | | |
| Phonics-Focused Professional Development Models | | | | | | | | | | |
| Sing, Spell, Read, Write | | | | | | | | | | |
| Jones (1995) | Matched (S) | 7 months | 4 classes; 97 students (50E, 47C) | 1 | School in Appalachian Mississippi; 55% FL, 78% W, 22% AA | Gates MacGinitie Reading Comprehension | | -- | -0.21 | -0.21 |
| Early Reading Research (ERR) | | | | | | | | | | |
| Shapiro & Solity (2008) | Matched (S) | ‘ ! ' ln | 12 schools (6E, 6C); 434 students (235E, 199C) | K-1 | Schools in England | British Achievement Scales: Word Reading | -0.62 | | | -0.54 |
| | | | | | | NFER: Word Reading | -0.52 | | | |
| | | | | | | NFER: Accuracy | -0.59 | | | |
| | | | | | | NFER: Comprehension | -0.41 | | | |
| Reading and Integrated Literacy Strategies (RAILS) | | | | | | | | | | |
| Stevens et al. (2008) | Matched (S) | 2 years | 3 schools (2E, 1C); 237 students (112E, 125C) | K-1, 1-2 | Schools in small city in PA; 71% FL, 94% W | MAT: K-1 | -0.39 | | -0.41 | -0.41 |
| | | | | | | MAT: 1-2 | -0.43 | | | |








Study 


Design 
Large Small 


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Effect Sue by 
Subgroups Measure 


Decodin? 


Comprehension 


Overall 
Effect Siie 


Ladders to Literacy 


O'Connor (1999) 


Mitclied(S) 


1 tear 


4 schools 
(2E,2C> 
105 srudents 
(64E. 410 


K-l 


Laras urban district. 
4d%AA. 51% W 


Woodcock Letter- Word ID 


-092 


-0.20 


- 


-0.20 


Woodcock Letter- Word ID 
(l->»ar folkwrop) 


-0.02 


Woodcock Word Attack 
(l-yearfollois-up) 


-038 


Ortou- Gillinekam 


Joshi et si. (2002) 


Matched(S) 


1 year 


4 schools 
56 students 
(24E, 32C) 


1 


High-poverty sdiooh 
in the SoudKwest 
81% FL. 

53% minority 


Woodcock Word Attack 


-028 


-0.28 


-0.58 


-0.43 


GMRT


-038 


Other Professional Development Models


Four Blocks 


Scarcelli & Morgan (1999)


Matched (S)


1 year


55 students
(25 E, 30 C)
in 4 classes
(2C, 2E)


1 


Title I school in
Virginia Beach, VA


GMRT 




- 


-0.56 


-0.56 



Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; MAT=Metropolitan Achievement Test; TERA=Test of Early Reading Ability; TOWRE=Test of Word
Reading Efficiency; DORT=Durrell Oral Reading Test; GMRT=Gates-MacGinitie Reading Test; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic.






Table 4: Curriculum + Instructional Process Programs in Beginning Reading 


Study 


Design 
Large/Small


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Effect Sizes by 
Subgroup 
Measure 


Decoding 


Comprehension


Overall
Effect
Size


Success for All 


Borman et al. (2007)


Randomized (L)


3 years


35 schools
2108 students
(1085E, 1023C)


K-2 


Title I schools 
throughout the U.S.,
72% FL, 57% AA,
31% W, 10% H 


Woodcock 




i 

k> 

oo 


-0.21 


-0.25 


Word Identification 


-0.22 


Word Attack 


-0.33 


Passage Comprehension


-0.21 


Correnti (2009)


Matched (L) 


4 years 


115 schools 
(30E, 85C)
3783 students
(831E, 2932C)


K-3 


High poverty schools 
in 17 states. 

69% FL, 52% AA,
22% W, 19% H, 6%
Asian 


Terra Nova 








-0.43 


Madden et al. (1993);
Slavin et al. (1993)


Matched (L) 


5 years


10 schools
(5 E, 5 C)
1925 students
(890E, 1035C), 5
cohorts
(1st grade in
experiment 1 year,
2nd grade 2 years,
etc.)


1-5 


African American
students in high-
poverty schools in
Baltimore, MD


Average of Woodcock,
DORT, and CTBS




-0.55 


-0.39 


-0.46 


1st grade 


-0.55 


2nd grade 


-0.32 


3rd grade 


-0.49 


CTBS 




4th grade 


-0.45 


5th grade


-0.48


Nunnery et al. (1996)


Matched (L)


2 years


64 schools 
(46E, 18C) 
1555 students 


1-2 


High-poverty schools 
in Houston, TX
79% FL, 52% H,
48% AA


Average of Woodcock 
and DORT 




-0.09 


-0.02 


-0.05 


First cohort (Gr. 2) 


-0.08 


Second cohort (Gr. 1)


-0.09 


Spanish (Gr. 1) 


-0.21 


Livingston & Flaherty (1997)


Matched (L) 


2 years 


6 schools 

(3 E, 3 C) 

3 cohorts: 
English speakers 

(272E, 184C)

Spanish bilingual

(87 E, 93 C)
Other ESL
(80 E, 112 C)


1,2 


High-poverty
multilingual schools
in Modesto and
Riverside, CA


Average of Woodcock and
DORT across cohorts




-0.49 


-0.49 


-0.49 


English-Dominant


-0.28


Spanish Bilingual


-0.77 


ESL 


-0.43 


Ross et al. (1996) 


Matched (L) 


1 year 


4 schools 
(2E, 2C)
540 students 
(169 E, 371 C) 


1 


Mostly Hispanic
schools in

Amphitheater District
near Tucson, AZ


Average of Woodcock and 
DORT 




-0.62 


-0.33 


-0.47 






Study 


Design 
Large/Small


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest


Effect Sizes by
Subgroup
Measure


Decoding


Comprehension


Overall

Effect
Size


Jones et al. (1997)


Matched (L) 


3 years 


2 schools 
(1E, 1C)
498 students
(339E, 159C)
Cohort 1:
172 students
(113E, 59C)
Cohort 2:
157 students
(109E, 48C)
Cohort 3:
169 students
(117E, 52C)


3 Cohorts: 
Cohort 1: 
K-3 

Cohort 2: 
K-2 

Cohort 3: 
K-1


High-poverty AA
schools in Charleston,
SC 


Woodcock 




-0.23


-0.02 


-0.27 


Kindergarten


-0.98


Woodcock and DORT




1st grade 


-0.20 


SAT or BSAP




1st grade 


-0.03 


SAT 




2nd grade 


-0.10 


SAT 




3rd grade 


-0.06 


B. Chambers et al. (2005)


Matched (L) 


1 year 


8 schools
(4E, 4C)
455 students
(311E, 144C)


K-1


Mostly Hispanic
communities in the 

US 


Woodcock Reading
Mastery Test




-0.20


-0.21 


-0.20 


Ross, Smith, & Casey (1992)


Matched (L) 


3 years 


2 schools 

(IE, 1C) 
370 students 
(223E, 147C) 

3 cohorts 


1-3 


Rural schools in 
Caldwell, ID 


Average of Woodcock and 
DORT 




-0.10 


-0.11 


-0.10 


Ross & Casey (1998b)


Matched (L)


2 years


8 schools 
(3E, 5C) 
356 students 
(151E, 205C) 


K-1


High-poverty schools

in Ft. Wayne, IN;
75% FL, 45% minority


Woodcock 




-0.33


-0.17 


-0.25 


Word Identification


-0.22 


Word Attack 


-0.45 


Passage Comprehension


-0.14 


Durrell Oral 


-0.21 


Munoz & Dossett (2004) 


Matched (L) 


3 years 


6 schools 

(3 E, 3 C) 
349 students 
(217 E, 132 C) 


K-3 


High-poverty schools
in Louisville, KY 


CTBS 




- 


-0.15 


-0.15 






Study 


Design 
Large/Small


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Effect Sizes by 
Subgroup 
Measure 


Decoding 


Comprehension


Overall
Effect
Size


Dianda & Flaherty (1995)


Matched (L)


2 years


6 schools (3E, 3C)
319 students
(131E, 188C)


1 


Mostly Hispanic 
students in schools in 
California 
72% FL, 42% H,
34% W,
32% ELL


Woodcock 




-0.41 


-0.45 


-0.42 


Letter-Word Identification


-0.46


Word Attack


-0.36 


Passage Comprehension 


-0.45 


Woodcock (all three 
measures) 




English speakers 


-0.55 


Spanish bilingual 


-0.84 


Spanish dominant 


-0.82 


Non-English speakers 


+0.11


Ross & Casey (1998a)


Matched (L) 


1 year 


4 schools 

(2 E, 2 C) 
316 students 
(156 E, 160 C) 


1 


Suburban schools in 
Portland, OR 


Average of Woodcock and 
DORT 




0.00 


0.02 


0.01 


Ross, Smith, & Casey (1997)


Matched (L) 


2 years 


Cohort 1:
135 students
(94E, 41C)
Cohort 2:
146 students
(106E, 40C)


K-1

1-2


High-poverty schools
in Clarke Co., GA 


Average of Woodcock and 
DORT 




-0.22


-0.08


-0.15 


1st grade 


-0.27 


2nd grade 


-0.03 


Ross et al. (1995)


Matched (L)


3 years


2 schools 

3 cohorts 
251 students 

Cohort 1:
59E, 47C
Cohort 2:
54E, 20C
Cohort 3:
45E, 32C


K-4


Title I schools in Ft.
Wayne, IN 


Average of Woodcock and 
DORT 




-0.09 


0.09 


0.00 


2nd grade 


+0.10 


3rd grade 


-0.10 


4th grade


0.00 


Casey et al. (1994)


Matched (S)


1 year


3 schools

(2 E, 1 C)
189 students
(116 E, 73 C) 


1 


High-poverty African
American schools in
Memphis, TN


Woodcock 




-0.78 


-0.53 


-0.65 


Word Identification 


-0.52 


Word Attack 


-1.03 


Passage Comprehension 


-0.63 


Durrell Oral Reading 


-0.42 






Study 


Design 
Large/Small


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Effect Sizes by
Subgroup
Measure


Decoding


Comprehension


Overall

Effect

Size


Ross, Smith, & Bond (1994)


Matched (S)


2 years


Cohort 1: 

4 schools 
133 students 
(65E, 68C)
Cohort 2:

2 schools
46 students
(20E, 26C)


K-1

1-2 


African American 
students in high- 
poverty schools in 
Montgomery, AL


Average of Woodcock and 
DORT 




-0.76 


-0.47 


-0.62 


K-1 Cohort


-0.39 


1-2 Cohort 


+1.15 


Smith et al. (1994) 


Matched (S) 


4 years 


2 schools 
142 students 
(74E, 68C)
4 cohorts


1-4


High-poverty AA
school in Memphis


Average of Woodcock 
and DORT Gray 




-0.55 


-0.65 


-0.60 


1st grade


-1.15


2nd grade


-0.08


3rd grade


-0.56


4th grade


-0.04 


Wasik & Slavin (1993)


Matched (S) 


3 years 


2 schools 
(1E, 1C)

3 cohorts 


1-3 


High-poverty schools 
in Charleston, SC,
40% FL, 60% AA


Average of Woodcock and 
DORT 




-0.39


-0.39 


-0.39 


1st grade


I 

O 

i J 
o 


2nd grade


-0.67


3rd grade


-0.30


Slavin & Madden (1991)


Matched (S) 


2 years 


2 schools 

(1 E, 1 C)
108 students
(58 E, 50C) 


1-2 


Small rural town in
Maryland
40% FL, 50% AA,
50% W


Average of Woodcock and 

DORT 


-0.02 


-0.02 


-0.02 


-0.02 


CTBS 


-0.02 


Wang & Ross (1999a)


Matched (S)


1 year


4 schools
(2 E, 2 C)
97 students 
(50 E, 47 C) 


1 


High-poverty schools
in Little Rock, AR


Average of Woodcock and 
DORT 




-0.20


-0.39 


-0.30 


Wang & Ross (1999b)


Matched (S)


1 year 


2 schools 
(1 E, 1 C)
82 students
(43 E, 39 C)


1 


High-poverty mostly 
Hispanic schools in 
Alhambra District near
Phoenix, AZ


Average of Woodcock and 
DORT 




-0.15 


-0.16 


-0.15 


Slavin & Madden (1998)


Matched (S) 


3 years 


50 students 
(21E, 29C)


1-3


Spanish-dominant
LEP students in 
Philadelphia, PA who 
had transitioned to 
English classes 


Woodcock 




-0.36


-0.07 


-0.22 


Word Attack 


-0.65 


Word Identification 


-0.06 


Passage Comprehension


-0.07 






Study 


Design 
Large/Small


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Effect Sizes by
Subgroup
Measure


Decoding


Comprehension


Overall
Effect
Size


Direct Instruction 


Kennedy (1978) 


Matched (L) 


4 years


2216 children
(1161E, 1055C)


K-3 


High poverty schools 
ink RL i : & MS 


MAT Reading 
Comprehension




- 


-0.07 


-0.07 


Mac Iver et al. (2003)


Matched (L)


4 years


12 schools
(6 E, 6 C)
275 students
(171E, 104C)


K-3 


High-poverty schools 
in Baltimore, majority
African-American 


CTBS 




- 


-0.13 


-0.07 


Reading Comprehension


-0.13


Vocabulary


0.00 


Grant (1973) 


Matched Post 
Hoc (S) 


2 years 


2 schools 
78 students 
(39E. 39C) 


K-l 


High-poverty African 
American students in 
WI


Wisconsin Reading Skill
Development 




-0.84 


- 


-0.84 


Long Vowels


-0.64 


Base Words 


-1.33 


Dale Johnson Word 
Recognition


-0.54 



Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; DORT=Durrell Oral Reading Test; CTBS=Comprehensive Test of Basic
Skills; SAT=Scholastic Achievement Test; BSAP=Basic Skills Assessment Program; MAT=Metropolitan Achievement Test; FL=Free/reduced-price lunch; W=White; AA=African American;
H=Hispanic; ELL=English language learner.






Table 5: Kindergarten-Only Studies


Study 


Design 

Large/Small


Duration 


N 


Grade 


Sample Characteristics 


Posttest 


Effect Sizes by 
Subgroup/Measure


Overall 
Effect Size 


Reading Curricula 


Superkids








43 classes 






SAT-10 






Borman & Dowling (2007) 


Matched (L)


1 year


(23E, 20C)




Schools throughout the


Sounds and Letters 


+0.25 




750 students 




U.S., 52% minority 


Word Reading 


+0.14 


+0.20 








(400E, 350C) 






Sentence Reading 


+0.22 










43 classes 
(21E, 22C) 
750 students 
(382E, 368C)






ITBS














Schools throughout the


Word Analysis 


+0.41 




D'Agostino (2009)


Matched (L)


1 year


K 


U.S., 47% FL, 


Reading Words 


+0.23 


+0.23 










38% minority


Reading Comprehension 


+0.24 












Vocabulary 


+0.02 




Voyager Universal Literacy


Frechtling et al. (2006)


Matched (L)


1 year


8 schools
(4 E, 4 C)


K


African American
students in 8 urban 
schools 


Woodcock 




+0.67 


398 students




Word ID 


+0.21 








(202 E, 196 C) 




Word Attack 


+1.11 










3 schools 






Woodcock 














High-poverty schools in 


Word ID 


-0.10 




Hecht(2003) 


Matched (S)


5 months 


(1 E, 2 C) 


K 


Word Analysis 


+0.10 


-0.02 


213 students 
(101 E, 112 C)


Orlando 


DIBELS 














Nonsense Word 


-0.07 




Instructional Technology


Waterford Early Reading Program


Paterson et al. (2003)


Matched (L)


1 year 


16 classes 
(8E, 8C)


K


High-poverty community
in western New York


Clay Word Recognition Test 




0.00 


Tracey & Young (2006)


Matched (L)


1 year 


15 classes 
(8E, 7C)
265 children 
(151E, 114C) 


K 


High-minority
northeastern community


TERA-2 




+0.47 






Study 


Design 

Large/Small


Duration 


N 


Grade 


Sample Characteristics 


Posttest 


Effect Sizes by 
Subgroup/Measure 


Overall 
Effect Size 


The Literacy Center (LeapFrog)




Randomized 




6 schools 




High-poverty schools in
Las Vegas, 30% ELL


Gates McGinitie 


+0.17 




RAC (2003) 


Quasi- 

Experiment (S) 


1 year 


258 students
(126E, 132C)


K


DIBELS


+0.12 


+0.14 


Destination Reading 








15 classes 
(8E, 7C)




High-poverty high-


DIBELS


-0.56




Barnett (2000)


Matched (L)


1 year


K


minority community in


Clay Word Recognition Test


-0.47 


-0.53 










FL 


Dolch 


-0.56 




Writing to Read


Stevenson et al. (1988)


Matched (S) 


1 year 


241 students 
(86E, 155C)


K 


African American 
students in Washington, 
DC 


MAT Reading




+0.35 


Granick & Reid (1987)


Matched (S)


1 year 


2 schools 
73 students 
(37E, 36C) 


K 


High-poverty African 
American schools in 
Baltimore 


MAT 




+0.02 


Instructional Process Programs


Ladders to Literacy 
























8 schools
(4E, 4C)

404 students 
3 groups: 
Ladders only: 
11 teachers, 
136 students; 

Ladders + PALS: 
11 teachers, 
133 students; 

Control: 

11 teachers, 
135 students 






Ladders to Literacy Group
















End of kindergarten 
















Woodcock 
















Word Attack 


+0.17 














Word ID 


-0.25 














Followup to Fall of first grade 








Randomized

(L)


20 weeks. 




Title I and non-Title I


Word Attack 


+0.38 




Fuchs et al. (2001)


with a one- 


K 


kindergartens in 


Word ID 


+0.05 


+0.21 




year followup 




Nashville, TN


Ladders + PALS Group
















End of kindergarten 
















Word Attack 


+0.36 














Word ID 


+0.25 














Followup to Fall of first grade 
















Word Attack 


+0.41 
















Word ID 


+0.43 








Study 


Design 

Large/Small


Duration 


N 


Grade 


Sample Characteristics


Posttest 


Effect Sizes by 
Subgroup/Measure 


Overall 
Effect Size 


O'Connor (1999)


Matched (L)


1 year 


17 classes 
(9E, 8C)
318 students
(192E, 89C)


K


Rural midwestern
district, 100% White


Woodcock Johnson Letter Word 
ID 




+0.43 


Typical children 


+0.33 


At-risk children 


+0.68


Little Books


Phillips et al. (1990)


Randomized 

Quasi- 

Experiment (L) 


1 year 


18 classes
309 students 


K 


Urban and rural schools 
in Newfoundland. 
Canada 


\ET 




+0.22


School -home 


+0.33 


School only 


+0.19 


Home only


+0.14 



Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; ITBS=Iowa Test of Basic Skills; SAT-10=Stanford Achievement Test; TERA=Test
of Early Reading Ability; MAT=Metropolitan Achievement Test; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic; ELL=English language learner.






Table 6

Upper Elementary Reading Curricula


Study 


Design 
Large/Small


Duration 


N 


Grade 


Sample Characteristics


Posttest


Effect Size by
Subgroup
Measure


Overall Effect Size


Core Basal Programs


Open Court 








5 schools 




High-poverty schools in


Terra Nova 






Borman, Dowling, & Schneck (2007)


Randomized (L) 


1 year 


33 teachers 


2-5 


ID, FL, NC, TX 


Comprehension 


-0.15 


-0.15 


(18E, 15C)


77% FL, 73% minority,


Composite


-0.15 








613 students 




il%ESL 


Vocabulary


-0.13 
















SAT-9 






Skindrud & Gersten (2006)


Matched (L) 


2 years 


434 students 
(292 E, 142 C)
Grade 3 cohort:
642 students
(350 E, 292 C)


2-3, 3-4


High-poverty schools in
Sacramento 


Grade 2-3 cohort 


-0.30 


-0.20 










Grade 3-4 cohort 


-0.10 




Reading Street












3 middle class schools; 2 


Gates MacGinitie






Wilkerson, Shannon, & Herman
(2006)


Randomized (L)


1 year 


5 schools 
32 teachers 


2-3 


Title I high poverty
schools, 54% FL, 57% W,


2nd grade 


-0.10 


-0.06 












25% AA, 11% H


3rd grade 


-0.01 




Wilkerson, Shannon, & Herman






40 teachers




4 schools nationwide,


Gates MacGinitie






Randomized (L)


1 year 


793 students 


2-3 


86% W, 3%AA. 


2nd grade 


-0.14 


-0.04 


(2007) 


(409E, 384C)




26% FL


3rd grade 


-0.06 




Houghton Mifflin Reading








10 schools 
(5E. 5C) 

2 Cohorts: 
Cohort 1: 
5S6 students 
(220E, 326C)






ITBS












Cohort 1: 
Grades 2-3 
Cohort 2: 
Grade 3 




Cohort 1 










Cohort 1: 


Mostly AA schools in


Reading


-0.08 




Sxrartz & Johnson (2003) 


Matched (L) 


2 years 

Cohort 2: 




Vocabulary


-OSS 


-0.11 


94% FL, 76% AA,


Total 


-0.15 






1 year 


16% W. 9%H 


Cohort 2 










Cohort 2 




Reading


-0.04 










465 students






Vocabulary


-0.17 










(91E, 374C)






Total 


-0.07 




1 

I 


Comer, Greene, & Munroe (2004)


Matched (L) 


1 year 


63 schools 
(18E, 45C)
12,832 students
(3,928 E, 8,904 C)


3-5


High-poverty schools in
Philadelphia


Terra Nova 




-0.10 






Study 


Design
Large/Small


Duration 


N 


Grade 


Sample Characteristics 


Posttest


Effect Size by
Subgroup
Measure


Overall Effect Size


Whole Language Basals


Rigby














Gates-MacGinitie


















Second Graders












4 schools 






Word Decoding


-0.22 












High-poverty schools,

80% FL, 57% AA,


Word Knowledge 


-0.07 




Wilkerson (2004) 


Matched (L) 


32 weeks 


(2 E, 2 C)


2 and 4 


Comprehension 


-0.23 


-0.26 


472 students 


29% H, 5% W


Total 


-0.03 








(245 E, 227 C) 




Fourth Graders 


















Vocabulary 


-0.61 
















Comprehension 


-0.33 
















Total 


-0.48




Supplementary Curricula


Schoolwide Enrichment Reading Model


Reis, Eckert, McCoach, Jacobs, &
Coyne (2008) 






31 teachers 
(17E, 14 C) 
544 students 
(306 E, 238 C)




2 middle-class schools in
New England towns, 36%


Oral Reading Fluency 


-0.08 




Randomized (L) 


14 weeks


3-5 


FL, 64% W, 28% H, 3%
AA, 3% Asian, 18% 

LEP 


ITBS


-0.15 


-0.12 


Elements of Reading: Comprehension














Gates-MacGinitie


















Vocabulary


-0.21 










18 teachers
(10E, 8C)
413 students
(229E, 184C)




Schools in AZ, KY, VA, 


Comprehension 


-0.11 












and OR. 


Total 


-0.17 




Resendez, Sridiharan, & Azin (2006)


Randomized (L) 




3 


69% FL, 36%W, 28% H, 


ERDA 




-0.09 










20% AA, 


Target Words in Context


-0.05 












6% Native American 


Narrative Passage Fluency


-0.03 
















Informational Passage Fluency


0.00 
















Reading Comprehension


-0.12 




Elements of Reading: Vocabulary








7 schools 
268 students

(147E, 121C) 




High-poverty schools in


Gates-McGinitie






Apthorp (2005a) 


Randomized Quasi 


1 year


3


AL and NY,


Vocabulary


-0.21 


-0.10 


experiment (L) 


83% FL, 49% AA, 


Comprehension 


-0.10 










46% W, 10% LEP 


ERDA Sight Vocabulary


0.00




Elements of Reading: Fluency








10 classes 
184 students
(97 E, 87 C) 




Majority White, high-


ERDA








Randomized Quasi
experiment (S)






poverty Title I schools


Word Identification 


0.00 




Apthorp (2005b) 


1 year




74% FL, 82% W, 12%


Narrative Passage Fluency


-0.15 


-0.10 








AA, 4% H, 8% LEP


Informational Passage Fluency


-0.18 
















Gates McGinitie Comprehension


-0.05 








Study 


Design
Large/Small


Duration 


N 


Grade 


Sample Characteristics


Posttest


Effect Size by
Subgroup
Measure


Overall Effect Size


Fluency Formula




Sivin-Kachala & Bialo (2005)


Randomized Quasi 
experiment (S) 


1 year 


12 classes 
128 students
(66E, 62C)


A 


Suburban districts in Long
Island, NY,

20% FL, 7% LEP


Woodcock Passage Comprehension




-0.24 


Jacob's Ladder




Stambaugh (2007)


Matched (S)


12 weeks 


2 schools 


3-5 


Rural high-poverty
schools in OH,
27% FL


ITBS




-0.02 


Contextually-Based Vocabulary Instruction


Nelson & Stage (2007)


Randomized Quasi 
experiment (L) 


3 months 


16 classes 
(8E, 8C)
308 students
(159E, 149C)


3, 5


Schools in midwestern
district,

32% FL, 70% W, 24% H,
24% LEP


Gates -MacGinitie 




-0.15 


Comprehension 


-0.27 


Vocabulary 


-0.03 


QuickReads


Huxley (2006)


Matched (S) 


12 weeks 


4 classes
(2E, 2C)
61 students
(35E, 26C)


3 


High-poverty suburban 
school. 

69% FL. 63% AA. 
33%W 


Gates -MacGinitie 




-0.24 


Comprehension 


-0.32 


Rate 


-0.30 


Accuracy 


-0.42 


TOWRE 




Sight Word


-0.13 


Decoding 


-0.12 



Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; SAT-9=Stanford Achievement Test 9th Edition; ITBS=Iowa Test of Basic Skills; ERDA=Early Reading
Diagnostic Assessment; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic; LEP=Limited English Proficient.






Table 7

Upper Elementary Technology Programs


Study 


Design 


Duration 


N 


Grade 


Sample Characteristics 


Posttest 


Effect Size by 
Subgroup/ 
Measure 


Overall Effect 


Large/Small


Size 


Supplemental CAI Programs


Academy of Reading 


















Campuzano et al. (2009)


Randomized (L) 


1 year 


41 teachers 
(22E, 19C) 
899 students
(495E, 404C) 


4 


Schools across the U.S. 
65%FL, 54%AA, 29%H, 
17%W 


SAT-10 




-0.01 


Leap Track 


















Campuzano, et al. (2009) 


Randomized (L) 


1 year 


55 teachers 
(29E,26C) 
1274 students 
(665E, 609C) 


4 


Schools across the U.S. 
61% FL, 57% AA, 33% W,
10%H 


SAT-10 




+0.09 




















Jostens (Earlier form of Compass Learning)


Aliffangis (1991) 


Randomized (S) 


1 year 


12 classes 
(6 E, 6 C) 


4-6 


School at an army base near 
Washington, D.C. 37% 
minority. 


CTBS Reading 




+0.15 


4th grade 


+0.30 


5th grade 


+0.20 


6th grade 


-0.04 


Becker (1994) 


Randomized (S) 


1 year 


1 school 
187 students


2-5


Inner city Baltimore
High poverty.


CAT 




+0.09 


Standish (1995)


Matched (S)


1 year


2 schools
139 students
(56E, 83C)




Students in suburban DE


NEAT 6 Reading 
Comprehension 




+0.05 


Estep (1997) 


Matched post hoc (L)


4 years 


106 schools 

(53E, 53C) 


3 


Elementary schools in IN


ISTEP 






Reading Vocabulary


+0.03 


+0.03 


Reading Total 


+0.03 


Clariana (1994) 


Matched post hoc (S)


1 year


85 students
(47E, 38C)


3 


School in a predominantly 
White, rural area. 


CTBS 




+0.20 


Compass Learning 


Kadel Research Consulting 
(2006) 


Matched post hoc (S)


2 years 


138 students (69 
E, 224 C) 


4-5 


Garfield Heights, OH 
50% FL, 63% W, 24% H 
13% AA 


OAT 




+0.29 


1 year 


-0.10 


2 years 


+0.29 


CCC SuccessMaker


Campbell (2000) 


Matched (L) 


1 year 


13 schools 
(7E.6C) 
701 students 
(310E, 391C) 


4-5 


Middle class students in 
Etowah, AL 


SAT 






Comprehension 


-0.09 


-0.02 


Vocabulary


+0.04 






Study


Design
Large/Small


Duration 


N 


Grade 


Sample Characteristics 


Posttest 


Effect Size by
Subgroup/ 
Measure 


Overall Effect 
Size 


Ragosta (1983) 


Matched (L) 


3 years


6 schools
(4E, 2C)
Eight 1-year
cohorts
Three 2-year
cohorts
One 3-year
cohort


4-6 


High poverty schools in Los 
Angeles 


CTBS 






One year


+0.17 


Comprehension 


+0.23 


Vocabulary 


+0.25 


Two years


Comprehension 


-0.01 


Vocabulary 


+0.17 


Three years


Comprehension 


-0.24 


Vocabulary 


+0.58 


Saracho (1982) 


Matched (L) 


1 year


256 students 
(128E, 128C) 


3-6 


Spanish-speaking migrant 
students 


CTBS Reading 




-0.09 


3rd 


-0.04 


4th


-0.25


5th


+0.16 


6th 


-0.17 


Classworks Gold


Whitaker (2005)


Matched post hoc (S)


1 year


2 schools, 
218 students 


4,5 


Schools in rural Tennessee 
62% Low SES


TCAP 




-0.14 


4th 


-0.10 


5th 


-0.19 


My Reading Coach


Vaughan, Serido, &
Wilhelm (2006)


Randomized (L)


1 year


4 schools 
284 students 
(127E, 157C) 


2-4 


Predominately minority
students from 4 schools in 3
states;

27% ELLs, 36% AA, 36%
H, 22% W


GRADE 




+0.24 


Vocabulary


+0.24 


Comprehension 


+0.22 


WICAT 


Miller (1997)


Matched post hoc (L)


3 years


30 schools 
(10E, 20C) 


3-5 


NYC Public Schools; 
Predominantly African
American and Hispanic,
17% ESL


DRP 




+0.02 


Clayton (1992) 


Matched post hoc (L)


1 year


5 schools
(1E, 4C)
426 students
(181E, 245C)


2-5


Schools in northwest SC
46% FL, 59% W, 39% AA


CTBS 




-0.01 


Mys & Petrie (1988)


Matched post hoc (L)


3 years


4 schools
(1E, 3C)
257 students
(81E, 176C)


2-4


Schools in Dearborn, MI


ITBS




-0.15 






Study


Design 


Duration 


N 


Grade 


Sample Characteristics 


Posttest 


Effect Size by
Subgroup/
Measure


Overall Effect


Large/Small


Size


Open Book to Literacy


Williams (2005)


Matched (S)


1 year 


2 schools 
(1E, 1C)
127 students
(66E, 61C)


4 


High-poverty schools in 
Memphis; 

51% W, 24% H, 21% AA 


TORC 




+0.28 


Other Supplemental CAI


Becker (1994) 


Randomized (S) 


1 year 


9 classes 
199 students 


2-5 


Schools in inner city
Baltimore
50% FL, 99% AA


CAT 




+0.06 


Easterling (1982) 
(MicroSystem 80)


Randomized (S) 


4 months 


2 schools 
42 students 
(21E, 21C) 


5 


Schools in suburban school 
district 


CAT Reading Comprehension 




+0.05 


Schmidt (1991) 
(Wasatch ILS)


Matched (L)


1 year


4 schools
(2E, 2C)
1,224 students
(646E, 578C)


2-6 


Schools in Southern CA 
25% FL 


CTBS 




+0.04 


Cooperman (1985) 


Matched (L) 


1 year 


3 schools 
(1E, 2C)
470 students
(204E, 266C)


2-4


Students from 3 low to
middle class schools. 
86% W, 13% AA 


CAT 




-0.06 


Bryg (1984) 


Matched (S)


15 weeks


9 teachers
(5E, 4C)
152 students
(83E, 69C)


4 


Schools in Omaha, NE 


CAT Reading 
Comprehension 




+0.20 


Roth & Beck (1987)


Matched (S)


1 year


6 classes
(3E, 3C)
108 students
(59E, 49C)


4


High-poverty low-achieving
urban schools 
100% AA 


Woodcock Word Attack 


+0.60 


+0.38 


CAT Vocabulary 


+0.53 


CAT Reading Comprehension 


0.00 


Coomes (1985) 


Matched (S)


1 year


4 schools
102 students
(51E, 51C)


4 


Middle class schools in TX 
90% W 


CTBS 




+0.02 


Hoffman (1984) 


Matched (S)


1 year


3 schools
96 students
(51E, 45C)


3 


Schools in suburban 
midwest 
11% minority 


Gates MacGinitie 




-0.07 


Comprehension 


-0.04 


Vocabulary


-0.10


Levy (1985)


Matched post hoc (L)


1 year


4 schools
581 students
(293E, 288C)


5 


Suburban NY school 
district 


SAT 




+0.19 






Study


Design

Large/Small


Duration 


N 


Grade 


Sample Characteristics 


Posttest 


Effect Size by
Subgroup/ 
Measure 


Overall Effect 
Size 


Computer-Managed Learning Systems


Accelerated Reader 














DRS 
















Low SES students in a 


Vocabulary 


+0.25 




Knox (1996) 


Randomized (S)


3 months 


77 students 


3-4 


southeastern state: 


Comprehension 


-0.13 


-0.03 


(40E, 37C) 


72% FL, 79% W, 13% AA,


SAT 














8% H


Vocabulary 


-0.07 
















Comprehension 


-0.17 




Yee (2007)


Matched (L)


1 year


3 schools
(1E, 2C)
2072 students
(612E, 1460C)


2-5 


Majority Hispanic schools 
in Los Angeles: 

92% FL, 79% H, 17% AA,
61% ELL 


CST 




+0.06 


Innovative Technology Applications


FastForWord 


Marion (2004)


Matched (L) 


1 year 


349 students 
(215E, 134C) 


5-6 


Schools in Appalachian TN 
52% FL, 100%W 


Terra Nova 




+0.25 








142 students 
(55E, 87C)




Middle class schools in
Northwest OH


Gates MacGinitie






Scientific Learning (2006) 


Matched (S)


15 weeks 


5-6 


Comprehension 


+0.12 


+0.11 










Vocabulary


+0.11 




Lightspan 








101 students 
(50E, 51C) 




Schools in the Caesar 


SAT 






Birch (2002) 


Matched post hoc (S)


2 years


2,3 


Rodney School District in


Vocabulary 


+0.59 


+0.42 










DE


Comprehension 


+0.25 





Note: L=large study with at least 250 students; S=small study with fewer than 250 students; E=Experimental; C=Control; CTBS=Comprehensive Test of Basic Skills; CAT=California Achievement Test;
CST=California Standards Test; MAT=Metropolitan Achievement Test; ITBS=Iowa Test of Basic Skills; ISTEP=Indiana Statewide Testing for Educational Progress; OAT=Ohio Achievement Test;
TCAP=Tennessee Comprehensive Assessment Program; GRADE=Group Reading Assessment and Diagnostic Examination; DRP=Degrees of Reading Power; WRAT=Wide Range Achievement Test;
SAT=Scholastic Achievement Test; DRS=Diagnostic Reading Scales; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic; ELL=English language learners; LEP=Limited
English Proficient.






Table 8 

Upper Elementary Instructional Process Programs


Study


Design
Large/Small


Duration


N


Grade 


Sample Characteristics 


Posttest 


Effect Size by
Subgroup/Measure


Overall 
Effect Size 








Cooperative Learning


Cooperative Integrated Reading and Composition (CIRC)


Stevens & Slavin (1995a)


Matched (L) 


2 years 


7 schools 
(3E, 4C)

63 classes
(31E, 32C)
1299 students
(635E, 664C)


2-6 


Working-class suburb of 
Baltimore 
9% FL, 95% W


CAT 




+0.23 


Vocabulary 


+0.20


Comprehension


+0.26


Stevens & Slavin (1995b) 


Matched (L) 


2 years 


5 schools 
(2E, 3C)

45 classes
(21E, 24C)
873 students
(411E, 462C)


2-6 


Suburban district in 
Maryland 
10% FL, 93% W


CAT 




+0.25 


Comprehension 


+0.28 


Vocabulary 


+0.21


Jenkins et al. (1994) 


Matched (L) 


1 year 


2 schools 
860 students

(332E, 528C)


1-6 


Mount Vernon, WA
36% FL


MAT 






Comprehension 


+0.09 


+0.18 


Vocabulary


+0.31 


Total 


+0.18 


Stevens, Madden, Slavin, & 
Farnish (1987; Study 1)


Matched (L) 


12 weeks 


10 schools 
(6E, 4C)
21 classes
(11E, 10C) 


3-4 


Middle-class suburb of 
Baltimore 

4% FL, 84% W, 16% AA


CAT 




+0.18 


Comprehension 


+0.19


Vocabulary


+0.17 


Stevens, Madden, Slavin, & 
Farnish (1987; Study 2)


Matched (L) 


6 months 


9 schools 
(4E, 5C) 
22 classes
(9E, 13C) 
450 students 


3-4


Middle-class suburb of 
Baltimore 

18% FL, 78% W, 22% AA


CAT 




+0.45 


Comprehension 


+0.35 


Vocabulary 


+0.11 


Total 


+0.23 


Durrell 


+0.54 


Bramlett (1994) 


Matched (L) 


1 year 


8 schools
(9C, 9E)

18 classes
392 students
(198E, 194C)


3 


Rural southern Ohio 


CAT 




+0.08 


Comprehension 


+0.10 


Total Reading 


+0.07 


Word Analysis 


+0.10 


Vocabulary 


+0.03 






Study


Design 

Large/Small 


Duration 


N 


Grade 


Sample Characteristics


Posttest


Effect Size by
Subgroup/Measure


Overall 
Effect Size 








Rapp (1991) 


Matched (S) 


1 year


2 schools
(1E, 1C)
88 students

(43 E, 45 C) 


3 


Working-class schools in
Lewistown, ID 


ITBS 




+0.14 


Comprehension 


+0.09 


Vocabulary 


+0.18 


Calderon, Hertz-Lazarowitz, &
Slavin (1998)


Matched (S) 


2 years 


7 schools 
(3E, 4C) 
Year 1: 
84 students
(51E, 33C)
Year 2:
59 students
(26E, 33C)


2 and 3 


Spanish-dominant students
transitioning to English in
high-poverty schools near
the Mexican border in
Texas.

79% H 


STAAS 2nd graders 


+0.30 


+0.87 


NAPT 3rd graders 




1 year 


+0.62 


2 years 


+0.87 


Skeans (1991) 


Matched post hoc 

(L) 


19 months


630 students
(348E, 282C)


3 and 5 


Suburban district near 
Houston 


MAT: 3rd grade 




-0.03 


Vocabulary 


+0.20


Comprehension 


+0.08 


MAT: 5th grade 




Vocabulary 


-0.15 


Comprehension 


-0.24


Reader's Theater 


Carrick (2000)


Matched (S) 


17 weeks


98 students

(53E, 45C)


5 


Urban New Jersey 
80% FL, 85% AA, 11% H


Compared to control 




+0.29


Terra Nova


+0.22


Oral Reading 


+0.50 


Compared to paired 
reading 




Terra Nova 


+0.12 


Oral Reading 


+0.30 


Same-Age Tutoring Programs


PALS 


Fuchs, Fuchs, Kazdan, & Allen 
(1999) 


Randomized quasi- 
experiment (S) 


21 weeks


45 students 
15 students each in 
PALS, PALS-HG (PALS 
+ tutoring strategies), and 
control 


2-3 


Students in a southeastern
city

24% FL, 62% W, 38% AA


SDRT Reading
Comprehension




+0.36 


PALS 


+0.72 


PALS-HG


0.00 






Study 


Design 

Large/Small 


Duration 


N


Grade


Sample Characteristics


Posttest


Effect Size by
Subgroup/Measure


Overall 
Effect Size 








Same-Age Tutoring + Strategy Instruction


Van Keer & Verhaeghe (2005)


Matched (L) 


1 year 


Second graders: 
11 classes
(5E, 6C) 
215 students 
(91E, 124C) 
Fifth graders: 
10 classes 
(4E, 6C) 
208 students 
(101E, 107C)


2,5 


Middle class schools in 
Flanders, Belgium


Dutch Reading
Comprehension Test




+0.29 


2nd graders 


+0.17


5th graders 


+0.40


Van Keer & Verhaeghe (2008)


Matched (L) 


1 year 


Second graders: 
12 classes 
(6E, 6C) 
234 students 
(110E, 124C) 
Fifth graders: 
15 classes 
(9E, 6C) 
293 students
(186E, 107C)


2,5 


Middle class schools in 
Flanders, Belgium


Dutch Reading
Comprehension Test




+0.24 


2nd graders 


+0.26


5th graders


+0.21


Cross-Age Tutoring Programs 


Reading Together 


Policy Studies Associates (2007)


Randomized (S) 


1 year 


124 students 
(56E, 68C)


2 


School in Irving, TX


Terra Nova 




-0.01 


Cross-Age Tutoring


Hilger (2000)


Matched (S) 


1 year 


1 school 
72 students 

(47 E, 35 C) 


3 


High-poverty school,
78% FL; 34% AA, 34%
Asian, 26% W, 5% H


STAR 


+0.16


+0.37 


Fluency 


+0.58






Study


Design
Large/Small


Duration 


N 


Grade 


Sample Characteristics 


Posttest 


Effect Size by 
Subgroup/Measure


Overall
Effect Size 








Cross-Age Tutoring + Strategy


















Van Keer & Verhaeghe (2005)


Matched (L) 


1 year 


Second graders: 

9 classes 
(3E, 6C) 

190 students 
(66E, 124C)
Fifth graders:

10 classes
(4E, 6C)

276 students
(169E, 107C)


2,5 


Middle class schools in 
Flanders, Belgium 


Dutch Reading
Comprehension Test




+0.27 


2nd graders 


+0.22


5th graders


+0.32


Van Keer & Verhaeghe (2008)


Matched (L) 


1 year 


Second graders: 
14 classes 
(8E, 6C) 
286 students 
(162E, 124C)
Fifth graders:
13 classes
(7E, 6C)
263 students
(156E, 107C)


2,5 


Middle class schools in 
Flanders, Belgium 


Dutch Reading
Comprehension Test




+0.35 


Second graders 


+0.42 


Fifth graders 


+0.28


Strategy Instruction


Belgian Strategy Model


Van Keer & Verhaeghe (2005)


Matched (L) 


1 year 


Second graders: 

14 classes 
(8E, 6C)
287 students
(163E, 124C)
Fifth graders:
14 classes
(8E, 6C)
284 students
(177E, 107C)


2,5 


Middle class schools in 
Flanders, Belgium 


Dutch Reading
Comprehension Test




+0.30 


Second graders


+0.24


Fifth graders 


+0.35






Study


Design 

Large/Small 


Duration 


N


Grade 


Sample Characteristics 


Posttest 


Effect Size by
Subgroup/Measure


Overall 
Effect Size 








Thinking Maps


Leary (1999) 


Matched (S) 


1 year 


2 schools 
(1E, 1C)
78 students
(41E, 37C)


4 


High-poverty schools in 
southeastern VA
79% FL, 69% AA, 31% W


SAT-9 




+0.31 


Hickie (2006) 


Matched post hoc

(S)


2 years


2 schools
(1E, 1C)
54 students
(24E, 30C)


4-5 


High-poverty white schools 
in northeastern TN
91% FL


TCAP 




+0.70 


Foundations and Frameworks 


Blackmon (2008) 


Matched (S) 


1 year


5 schools
(3E, 2C)
103 students
(52E, 51C)


4-5 


Philadelphia Christian 
schools; 

predominantly AA, H


Gates MacGinitie 




-0.02 


Comprehension 


-0.08 


Vocabulary 


+0.04


Reciprocal Teaching


Spörer, Brunstein, & Kieschke

(2007) 


Matched (S) 


19 weeks 


105 students 


3-6 


Middle-class schools in
Germany 


German standardized
comprehension test 




+0.57 


Fluency Instruction


FORI 


Kuhn et al. (2006)


Randomized
quasi-experiment

(S)


1 year


5 schools
(3E, 2C)
227 students
(143E, 84C) 


2 


High-poverty schools in NJ
and GA

58% FL, 51% AA, 23% W,
21% H, 5% Asian


TOWRE 


+0.29


+0.19 


GORT-4 


+0.10


WIAT


+0.18


Structured Phonetic Intervention Programs 


Exemplary Center for Reading Instruction (ECRI)


Reid (1996) 


Matched post hoc

(L)


1 year


5 schools
(4E, 1C)
921 students
(590E, 331C)


2-6 


High-poverty 
schools in eastern TN 
99% W 


SAT 




+0.65 


Comprehension 


+0.71 


Vocabulary 


+0.59 


Cohen (1991) 


Matched post hoc 

(L) 


1 year


473 students

(242E, 231C)


3 


Urban school district 
45% AA, 34% W, 21% H


ITBS 




+0.14 


Comprehension 


+0.07 


Vocabulary


+0.21 






Study


Design
Large/Small


Duration 


N 


Grade 


Sample Characteristics


Posttest 


Effect Size by 
Subgroup/Measure


Overall 
Effect Size 








Phonics-Based Professional Development


Language Essentials for Teachers of Reading and Spelling (LETRS)


Garet et al. (2008) 


Randomized (L) 


1 year 


90 schools 
5530 students 
(1983 LETRS,
1738 LETRS +
Coaching,
1809C)


2 


6 urban districts 

78% FL, 78% AA, 15% W,
5% H


Various state 
assessments 




+0.06 


LETRS 


+0.08 


LETRS + Coaching


+0.03 


Integrated Language Arts Programs


Literature-Based Program




Morrow (1992) 


Randomized quasi- 
experiment (S) 


1 year 


9 classes 
166 students 
(56 LBP + parents,
46 LBP only,
64C)


2 


Students in two suburban 
schools in NJ

24% FL, 43% AA, 37% W, 
14% Asian 


CAT 




+0.21 


School + home


+0.21


School only


+0.20


Success in Reading and Writing 




Lindsey (1988)


Matched (S) 


1 year 


2 schools 
(1E, 1C)
97 students
(56E, 41C)


2-3 


Elementary schools in the
Pacific Northwest


CAT 




-0.11 


Comprehension 


-0.23 


Vocabulary 


+0.01 


Carbo Reading Styles




Oglesby & Suter (1995) 


Matched (S) 


1 year 


13 classes
(6E, 7C)
198 students
(105E, 93C)


3 and 6 


Urban school in the
mid-south

80% AA, 20% W, 81%
remedial


Gates MacGinitie




+0.27 


Classroom Management and Motivation Programs 


Consistency Management-Cooperative Discipline (CMCD)


Freiberg, Prokosch, Treister, &
Stein (1990)


Matched post hoc

(L)


2 years 


10 schools 
(5E, 5C)
699 students
(364E, 335C)


2-5 


High-poverty schools in
Houston

72% FL, 90% AA


MAT-6 
(grades 2-5) 


+0.09 


+0.12 


TEAMS 
(grades 3 and 5) 


+0.14 






Study


Design
Large/Small


Duration 


N 


Grade 


Sample Characteristics 


Posttest 


Effect Size by
Subgroup/Measure


Overall 
Effect Size 








Opuni (2006) 


Matched post hoc 

(L) 


1 year


14 schools
(7E, 7C)
456 students
(228E, 228C)


3 


High-poverty schools in 
Newark, NJ 
78% FL, 90% AA 


SAT-9 




+0.26 


Student Success Skills 


Campbell and Brigman (2005)


Randomized (L) 


6 months 


20 schools 
480 students 
(240E, 240C)


5-6


Low-achieving students in
Florida

62% FL, 82% W, 9% AA,
5% H


FCAT 




+0.23 


Responsive Classroom


Rimm-Kaufman, Fan, Chiu, &

You (2007)


Matched post hoc

(L)


3 years 


6 schools 
(3E, 3C)

3 groups:
grades 2-5
381 students
(211E, 170C)
grades 3-5
502 students
(282E, 220C)
grades 4-5 
506 students 
(266E, 240C) 


2-5 


Schools in a northeastern 
urban district, 

35% FL, 57% W, 22% AA,
21% H


DRP 




+0.15 


Grades 2-5 


+0.21


Grades 3-5 


+0.16 


Grades 4-5 


+0.07 



Note: L=large study with at least 250 students; S=small study with fewer than 250 students; E=Experimental; C=Control; CAT=California Achievement Test; MAT=Metropolitan Achievement
Test; ITBS=Iowa Tests of Basic Skills; STAAS=Texas Assessment of Academic Skills-Spanish; NAPT=Norm-Referenced Assessment Program for Texas; SDRT=Stanford Diagnostic
Reading Test; SAT=Stanford Achievement Test; TCAP=Tennessee Comprehensive Assessment Program; PALS=Peer-Assisted Learning Strategies; PALS-HG=Peer-Assisted Learning
Strategies with Help-Giving Training; TOWRE=Test of Word Reading Efficiency; GORT=Gray Oral Reading Test; GRADE=Group Reading Assessment and Diagnostic Examination;
STAR=Standardized Test for Assessment of Reading; WIAT=Wechsler Individual Achievement Test; TEAMS=Texas State Assessment of Academic Skills; SAT=Scholastic Achievement
Test; DRP=Degrees of Reading Power; FCAT=Florida Comprehensive Assessment Test; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic; CTBS=Comprehensive
Test of Basic Skills.






